Optimizers
Collection of Ivy optimizers.
- class ivy.stateful.optimizers.Adam(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]
Bases: Optimizer
- __init__(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]
Construct an ADAM optimizer.
- Parameters
  - lr (float) – Learning rate. Default is 1e-4.
  - beta1 (float) – Gradient forgetting factor. Default is 0.9.
  - beta2 (float) – Second-moment gradient forgetting factor. Default is 0.999.
  - epsilon (float) – Divisor used during the Adam update, preventing division by zero. Default is 1e-07.
  - inplace (bool) – Whether to update the variables in place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the optimizer's variables, e.g. 'cuda:0', 'cuda:1' or 'cpu'. Default is None.
- set_state(state)[source]
Set state of the optimizer.
- Parameters
  - state (Container) – Nested state to update.
- property state
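Below is a minimal usage sketch for Adam. The backend choice, the single-weight Container and the hand-written gradients are illustrative assumptions, and the step(v, grads) call is inherited from the Optimizer base class rather than documented in this section.

```python
import ivy

ivy.set_backend("numpy")  # assumption: any installed backend; older Ivy versions name this ivy.set_framework

# hypothetical "model": a single weight vector held in an ivy.Container
v = ivy.Container({"w": ivy.array([1.0, 2.0, 3.0])})
# hand-written gradients standing in for a real backward pass
grads = ivy.Container({"w": ivy.array([0.1, -0.2, 0.3])})

optimizer = ivy.stateful.optimizers.Adam(lr=1e-3, beta1=0.9, beta2=0.999)
v = optimizer.step(v, grads)  # returns the updated variables
```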
- class ivy.stateful.optimizers.LAMB(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]
Bases: Optimizer
- __init__(lr=0.0001, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False, device=None)[source]
Construct a LAMB optimizer.
- Parameters
  - lr (float) – Learning rate. Default is 1e-4.
  - beta1 (float) – Gradient forgetting factor. Default is 0.9.
  - beta2 (float) – Second-moment gradient forgetting factor. Default is 0.999.
  - epsilon (float) – Divisor used during the Adam update, preventing division by zero. Default is 1e-07.
  - max_trust_ratio (float) – Maximum value of the trust ratio, i.e. the ratio between the norm of the layer weights and the norm of the gradient update. Default is 10.
  - decay_lambda (float) – Factor used for weight decay. Default is 0.
  - inplace (bool) – Whether to update the variables in place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the optimizer's variables, e.g. 'cuda:0', 'cuda:1' or 'cpu'. Default is None.
- set_state(state)[source]
Set state of the optimizer.
- Parameters
  - state (Container) – Nested state to update.
- property state
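The sketch below mirrors the Adam example above, adding the two LAMB-specific arguments; the Container, gradients and backend choice are the same illustrative assumptions.

```python
import ivy

ivy.set_backend("numpy")  # older Ivy versions name this ivy.set_framework

v = ivy.Container({"w": ivy.array([1.0, 2.0, 3.0])})       # hypothetical weights
grads = ivy.Container({"w": ivy.array([0.1, -0.2, 0.3])})  # stand-in gradients

optimizer = ivy.stateful.optimizers.LAMB(lr=1e-3, max_trust_ratio=10, decay_lambda=1e-2)
v = optimizer.step(v, grads)  # trust-ratio-scaled update of the weights
```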
- class ivy.stateful.optimizers.LARS(lr=<function LARS.<lambda>>, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]
Bases: Optimizer
- __init__(lr=<function LARS.<lambda>>, decay_lambda=0, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]
Construct a Layer-wise Adaptive Rate Scaling (LARS) optimizer.
- Parameters
  - lr (float) – Learning rate. Default is 1e-4.
  - decay_lambda (float) – Factor used for weight decay. Default is 0.
  - inplace (bool) – Whether to update the variables in place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
- set_state(state)[source]
Set state of the optimizer.
- Parameters
  - state (Container) – Nested state to update.
- property state
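A minimal LARS sketch, again reusing the assumptions from the Adam example; passing a plain float for lr in place of the callable default is also an assumption.

```python
import ivy

ivy.set_backend("numpy")  # older Ivy versions name this ivy.set_framework

v = ivy.Container({"w": ivy.array([1.0, 2.0, 3.0])})       # hypothetical weights
grads = ivy.Container({"w": ivy.array([0.1, -0.2, 0.3])})  # stand-in gradients

# assumption: a float lr is accepted in place of the callable default
optimizer = ivy.stateful.optimizers.LARS(lr=1e-2, decay_lambda=1e-4)
v = optimizer.step(v, grads)  # layer-wise adaptively scaled update
```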
- class ivy.stateful.optimizers.Optimizer(lr, inplace=True, stop_gradients=True, init_on_first_step=False, compile_on_next_step=False, fallback_to_non_compiled=False, device=None)[source]
Bases: ABC
- __init__(lr, inplace=True, stop_gradients=True, init_on_first_step=False, compile_on_next_step=False, fallback_to_non_compiled=False, device=None)[source]
Construct a general Optimizer. This is an abstract class and must be derived from.
- Parameters
  - lr (float) – Learning rate.
  - inplace (bool) – Whether to update the variables in place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - init_on_first_step (bool) – Whether the optimizer is initialized on the first step. Default is False.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
  - fallback_to_non_compiled (bool) – Whether to fall back to a non-compiled forward call if an error is raised during the compiled forward pass. Default is False.
  - device (Optional[Union[Device, NativeDevice]]) – Device on which to create the optimizer's variables, e.g. 'cuda:0', 'cuda:1' or 'cpu'. Default is None.
- abstract set_state(state)[source]
Set state of the optimizer.
- Parameters
  - state (Container) – Nested state to update.
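Since Optimizer is abstract, it is only used through subclasses such as Adam, LAMB, LARS and SGD. The sketch below is a hypothetical plain gradient-descent subclass: the private _step(v, grads) hook and the self._lr attribute are assumptions not documented in this section, and only set_state is listed here as abstract.

```python
import ivy
from ivy.stateful.optimizers import Optimizer


class PlainGD(Optimizer):
    """Hypothetical vanilla gradient-descent optimizer, for illustration only."""

    def __init__(self, lr=1e-4):
        super().__init__(lr, inplace=True, stop_gradients=True)

    def _step(self, v, grads):
        # assumption: the base class routes its public step() through this hook;
        # ivy.Container supports elementwise arithmetic, so this updates every leaf
        return v - self._lr * grads

    def set_state(self, state):
        # this toy optimizer keeps no running state, so there is nothing to restore
        pass

    @property
    def state(self):
        return ivy.Container({})
```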
- class ivy.stateful.optimizers.SGD(lr=<function SGD.<lambda>>, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]
Bases: Optimizer
- __init__(lr=<function SGD.<lambda>>, inplace=True, stop_gradients=True, compile_on_next_step=False)[source]
Construct a Stochastic-Gradient-Descent (SGD) optimizer.
- Parameters
  - lr (float) – Learning rate. Default is 1e-4.
  - inplace (bool) – Whether to update the variables in place, or to create new variable handles. This is only relevant for frameworks with stateful variables such as PyTorch. Default is True, provided the backend framework supports it.
  - stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.
  - compile_on_next_step (bool) – Whether to compile the optimizer on the next step. Default is False.
- set_state(state)[source]
Set state of the optimizer.
- Parameters
  - state (Container) – Nested state to update.
- property state
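Finally, a minimal SGD sketch under the same assumptions as the Adam example; passing a float lr in place of the callable default is assumed to be interchangeable.

```python
import ivy

ivy.set_backend("numpy")  # older Ivy versions name this ivy.set_framework

v = ivy.Container({"w": ivy.array([1.0, 2.0, 3.0])})       # hypothetical weights
grads = ivy.Container({"w": ivy.array([0.1, -0.2, 0.3])})  # stand-in gradients

optimizer = ivy.stateful.optimizers.SGD(lr=1e-2)
v = optimizer.step(v, grads)  # one plain gradient-descent step
```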