Gradients

Collection of gradient Ivy functions.

class ivy.GradientTracking(with_grads)[source]

Bases: object

__init__(with_grads)[source]
ivy.adam_step(dcdw, mw, vw, step, beta1=0.9, beta2=0.999, epsilon=1e-07)[source]

Compute the Adam step delta, given the derivatives of some cost c with respect to weights ws, using the ADAM update. `[reference] <https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam>`_

Parameters
  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • mw (Union[Array, NativeArray]) – running average of the gradients

  • vw (Union[Array, NativeArray]) – running average of second moments of the gradients

  • step (Union[int, float]) – training step

  • beta1 – gradient forgetting factor (Default value = 0.9)

  • beta2 – second moment of gradient forgetting factor (Default value = 0.999)

  • epsilon – divisor during adam update, preventing division by zero (Default value = 1e-7)

Return type

Array

Returns

ret – The adam step delta.

Functional Examples

With ivy.Array inputs:

>>> dcdw = ivy.array([1, 2, 3])
>>> mw = ivy.zeros(3)
>>> vw = ivy.zeros(1)
>>> step = ivy.array(3)
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step)
>>> print(adam_step_delta)
    (ivy.array([0.639, 0.639, 0.639]),
    ivy.array([0.1, 0.2, 0.3]),
    ivy.array([0.001, 0.004, 0.009]))
>>> dcdw = ivy.array([[1., 4., -3.], [2., 3., 0.5]])
>>> mw = ivy.zeros((2,3))
>>> vw = ivy.zeros(3)
>>> step = ivy.array(1)
>>> beta1 = 0.86
>>> beta2 = 0.95
>>> epsilon = 1e-6
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step, beta1, beta2, epsilon)
>>> print(adam_step_delta)
    (ivy.array([[1., 1., -1.],
                [1., 1., 1.]]),
    ivy.array([[ 0.14, 0.56, -0.42],
              [ 0.28, 0.42, 0.07]]),
    ivy.array([[0.05, 0.8, 0.45],
              [0.2, 0.45, 0.0125]]))
>>> dcdw = ivy.array([1, -2, 3])
>>> mw = ivy.zeros(1)
>>> vw = ivy.zeros(1)
>>> step = ivy.array(3.6)
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step)
>>> print(adam_step_delta)
    (ivy.array([ 0.601, -0.601, 0.601]),
    ivy.array([ 0.1, -0.2, 0.3]),
    ivy.array([0.001, 0.004, 0.009]))

With ivy.NativeArray inputs:

>>> dcdw = ivy.native_array([2, 3, 5])
>>> mw = ivy.native_array([0, 0, 0])
>>> vw = ivy.native_array([0])
>>> step = ivy.native_array([4])
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step)
>>> print(adam_step_delta)
    (ivy.array([0.581, 0.581, 0.581]),
    ivy.array([0.2, 0.3, 0.5]),
    ivy.array([0.004, 0.009, 0.025]))
>>> dcdw = ivy.native_array([3., -4., 1., 0., 2., -3., 2.6,])
>>> mw = ivy.zeros([7])
>>> vw = ivy.native_array([1])
>>> step = ivy.native_array([2])
>>> beta1 = 0.76
>>> beta2 = 0.992
>>> epsilon = 1e-5
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step, beta1, beta2, epsilon)
>>> print(adam_step_delta)
    (ivy.array([0.209, -0.271, 0.0717, 0., 0.142, -0.209, 0.182]),
     ivy.array([ 0.72, -0.96, 0.24, 0., 0.48, -0.72, 0.624]),
     ivy.array([1.06, 1.12, 1., 0.992, 1.02, 1.06, 1.05]))

With a mixture of ivy.NativeArray and ivy.Array inputs:

>>> dcdw = ivy.array([1, 2, 3])
>>> mw = ivy.native_array([0, 0, 0])
>>> vw = ivy.zeros(1)
>>> step = ivy.native_array([2])
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step)
>>> print(adam_step_delta)
    (ivy.array([0.744, 0.744, 0.744]),
    ivy.array([0.1, 0.2, 0.3]),
    ivy.array([0.001, 0.004, 0.009]))

With ivy.Container inputs:

>>> dcdw = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> mw = ivy.Container(a=ivy.array([0., 0., 0.]), b=ivy.array([0., 0., 0.]))
>>> vw = ivy.Container(a=ivy.array([0.]), b=ivy.array([0.]))
>>> step = ivy.array([3.4])
>>> beta1 = 0.87
>>> beta2 = 0.976
>>> epsilon = 1e-5
>>> adam_step_delta = ivy.adam_step(dcdw, mw, vw, step, beta1, beta2, epsilon)
>>> print(adam_step_delta)
({
    a: ivy.array([0., 0.626, 0.626]),
    b: ivy.array([0.626, 0.626, 0.626])
}, {
    a: ivy.array([0., 0.13, 0.26]),
    b: ivy.array([0.39, 0.52, 0.65])
}, {
    a: ivy.array([0., 0.024, 0.096]),
    b: ivy.array([0.216, 0.384, 0.6])
})
ivy.adam_update(w, dcdw, lr, mw_tm1, vw_tm1, step, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=None, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, using the ADAM update. `[reference] <https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam>`_

Parameters
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • mw_tm1 (Union[Array, NativeArray]) – running average of the gradients, from the previous time-step.

  • vw_tm1 (Union[Array, NativeArray]) – running average of second moments of the gradients, from the previous time-step.

  • step (int) – training step

  • beta1 – gradient forgetting factor (Default value = 0.9)

  • beta2 – second moment of gradient forgetting factor (Default value = 0.999)

  • epsilon – divisor during adam update, preventing division by zero (Default value = 1e-7)

  • inplace – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True, provided the backend framework supports it.

  • stop_gradients – Whether to stop the gradients of the variables after each gradient step. Default is True.

Return type

Array

Returns

ret – The new function weights ws_new, and also new mw and vw, following the adam updates.
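
Functional Examples

With ivy.Array inputs (an illustrative sketch: with zero running averages and step 1, the bias-corrected Adam step is approximately the sign of each gradient, so the weights move by roughly lr per coordinate; the printed decimals are approximate and may vary slightly by backend):

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = 0.1
>>> mw_tm1 = ivy.zeros(3)
>>> vw_tm1 = ivy.zeros(3)
>>> step = 1
>>> w_new, mw, vw = ivy.adam_update(w, dcdw, lr, mw_tm1, vw_tm1, step, inplace=False)
>>> print(w_new)
ivy.array([0.9, 1.9, 2.9])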

ivy.execute_with_gradients(func, xs, retain_grads=False)[source]

Call function func with the xs variables as input, and return the function's first output y, the gradients [dy/dx for x in xs], and any other function outputs after the returned y value.

Parameters
  • func – Function for which we compute the gradients of the output with respect to xs input.

  • xs – Variables with respect to which the function gradients are computed.

  • retain_grads – Whether to retain the gradients of the returned values. (Default value = False)

Returns

ret – the function's first output y, the gradients [dy/dx for x in xs], and any other extra function outputs
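
Functional Examples

A minimal sketch, assuming xs is a container of variables and that, for a single-output func, the call returns a (y, grads) pair (printed forms are indicative):

>>> xs = ivy.Container(w=ivy.variable(ivy.array([1., 2., 3.])))
>>> func = lambda xs: ivy.sum(xs['w'] ** 2)
>>> y, dydxs = ivy.execute_with_gradients(func, xs)
>>> print(y)
ivy.array(14.)
>>> # dydxs holds d(sum(w ** 2))/dw = 2 * w, i.e. [2., 4., 6.] under the key 'w'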

ivy.grad(func)[source]

Call function func, and return func’s gradients.

Parameters

func – Function for which we compute the gradients of the output with respect to xs input.

Returns

ret – the grad function
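
Functional Examples

A minimal sketch of the returned gradient function, assuming func maps an array to a scalar; on some backends the input must be an ivy variable:

>>> func = lambda x: ivy.sum(x ** 2)
>>> grad_fn = ivy.grad(func)
>>> x = ivy.variable(ivy.array([1., 2., 3.]))
>>> print(grad_fn(x))
ivy.array([2., 4., 6.])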

ivy.gradient_descent_update(w, dcdw, lr, inplace=None, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws].

Parameters
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • inplace – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True, provided the backend framework supports it.

  • stop_gradients – Whether to stop the gradients of the variables after each gradient step. Default is True.

Return type

Array

Returns

ret – The new weights, following the gradient descent updates.

Functional Examples

With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3], [4, 6, 1], [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1], [0.3, 0.6, 0.4], [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> NewWeights = ivy.gradient_descent_update(w, dcdw, lr, inplace=False, stop_gradients=True)
>>> print(NewWeights)
    ivy.array([[ 0.95, 1.98, 2.99],
               [ 3.97, 5.94, 0.96],
               [ 0.96, -0.07, 6.98]])

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.5, 0.2, 0.1])
>>> lr = ivy.array(0.3)
>>> ivy.gradient_descent_update(w, dcdw, lr, inplace=True)
>>> print(w)
    ivy.array([0.85, 1.94, 2.97])

With ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1., 2., 3.]), b=ivy.array([3.48, 5.72, 1.98]))
>>> dcdw = ivy.Container(a=ivy.array([0.5, 0.2, 0.1]), b=ivy.array([2., 3.42, 1.69]))
>>> lr = ivy.array(0.3)
>>> ivy.gradient_descent_update(w, dcdw, lr, inplace=True)
>>> print(w)
    {
        a: ivy.array([0.85, 1.94, 2.97]),
        b: ivy.array([2.88, 4.69, 1.47])
    }
ivy.is_variable(x, exclusive=False)[source]

Determines whether the input is a variable or not.

Parameters
  • x (Union[Array, NativeArray]) – An ivy array.

  • exclusive (bool) – Whether to check if the data type is exclusively a variable, rather than an array. For frameworks like JAX that do not have exclusive variable types, the function will always return False if this flag is set; otherwise the check is the same as for general arrays. Default is False.

Return type

bool

Returns

ret – Boolean, true if x is a trainable variable, false otherwise.

Examples

With ivy.Array input:

>>> x = ivy.array(2.3)
>>> is_var = ivy.is_variable(x)
>>> print(is_var)
    False
>>> x = ivy.zeros((3, 2))
>>> is_var = ivy.is_variable(x)
>>> print(is_var)
    False
>>> x = ivy.array([[2], [3], [5]])
>>> is_var = ivy.is_variable(x, True)
>>> print(is_var)
    False

With ivy.NativeArray input:

>>> x = ivy.native_array([7])
>>> is_var = ivy.is_variable(x)
>>> print(is_var)
    False
>>> x = ivy.native_array([2, 3, 4])
>>> is_var = ivy.is_variable(x)
>>> print(is_var)
    False
>>> x = ivy.native_array([-1, 0., 0.8, 9])
>>> is_var =  ivy.is_variable(x, True)
>>> print(is_var)
    False

With ivy.Container input:

>>> x = ivy.Container(a = ivy.array(3.2), b=ivy.array(2))
>>> exclusive = True
>>> is_var = ivy.is_variable(x, exclusive=exclusive)
>>> print(is_var)
{
    a: false,
    b: false
}

With multiple ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([2, -1, 0]), b=ivy.array([0., -0.4, 8]))
>>> exclusive = ivy.Container(a=False, b=True)
>>> is_var = ivy.is_variable(x, exclusive=exclusive)
>>> print(is_var)
{
    a: false,
    b: false
}
ivy.jac(func)[source]

Call function func, and return func’s Jacobian partial derivatives.

Parameters

func – Function for which we compute the gradients of the output with respect to xs input.

Returns

ret – the Jacobian function
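
Functional Examples

A minimal sketch of the returned Jacobian function, assuming func maps an array of shape (n,) to an array of shape (n,); the Jacobian of the elementwise square is diag(2 * x):

>>> func = lambda x: x ** 2
>>> jac_fn = ivy.jac(func)
>>> x = ivy.variable(ivy.array([1., 2., 3.]))
>>> print(jac_fn(x))
ivy.array([[2., 0., 0.],
           [0., 4., 0.],
           [0., 0., 6.]])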

ivy.lamb_update(w, dcdw, lr, mw_tm1, vw_tm1, step, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, inplace=None, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying LAMB method.

Parameters
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • mw_tm1 (Union[Array, NativeArray]) – running average of the gradients, from the previous time-step.

  • vw_tm1 (Union[Array, NativeArray]) – running average of second moments of the gradients, from the previous time-step.

  • step (Union[int, float]) – training step

  • beta1 (Union[int, float]) – gradient forgetting factor (Default value = 0.9)

  • beta2 (Union[int, float]) – second moment of gradient forgetting factor (Default value = 0.999)

  • epsilon (float) – divisor during lamb update, preventing division by zero (Default value = 1e-07)

  • max_trust_ratio (Union[int, float]) – The maximum value for the trust ratio (Default value = 10)

  • decay_lambda (Union[int, float]) – The factor used for weight decay (Default value = 0)

  • inplace (Optional[bool]) – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True, provided the backend framework supports it.

  • stop_gradients (bool) – Whether to stop the gradients of the variables after each gradient step. Default is True.

Return type

Array

Returns

ret – The new function weights ws_new, following the lamb updates.

Functional Examples

With ivy.Array inputs:

>>> w = ivy.array([1., 2, 3])
>>> dcdw = ivy.array([0.5,0.2,0.1])
>>> lr = ivy.array(0.1)
>>> vw_tm1 = ivy.zeros(1)
>>> mw_tm1 = ivy.zeros(3)
>>> step = ivy.array(1)
>>> new_weights = ivy.lamb_update(w,dcdw,lr,mw_tm1,vw_tm1,step)
>>> print(new_weights)
(ivy.array([0.784, 1.78, 2.78]), ivy.array([0.05, 0.02, 0.01]), ivy.array([2.5e-04, 4.0e-05, 1.0e-05]))
>>> w = ivy.array([[1., 2, 3],[4, 6, 1],[1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],[0.3, 0.6, 0.4],[0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> mw_tm1 = ivy.zeros((3,3))
>>> vw_tm1 = ivy.zeros(3)
>>> step = ivy.array(1)
>>> beta1=0.9
>>> beta2=0.999
>>> epsilon=1e-7
>>> max_trust_ratio=10
>>> decay_lambda=0
>>> inplace=None
>>> stop_gradients=True
>>> new_weights = ivy.lamb_update(w, dcdw, lr, mw_tm1, vw_tm1, step, beta1, beta2, epsilon, max_trust_ratio, decay_lambda, inplace, stop_gradients)
>>> print(new_weights)
(ivy.array([[ 0.639,  1.64 ,  2.64 ],
    [ 3.64 ,  5.64 ,  0.639],
    [ 0.639, -0.361,  6.64 ]]), ivy.array([[0.05, 0.02, 0.01],
    [0.03, 0.06, 0.04],
    [0.04, 0.07, 0.02]]), ivy.array([[2.5e-04, 4.0e-05, 1.0e-05],
    [9.0e-05, 3.6e-04, 1.6e-04],
    [1.6e-04, 4.9e-04, 4.0e-05]]))
>>> w = ivy.array([[1.,2,3],[4,6,2],[1,0,4]])
>>> dcdw = ivy.array([[0.5,0.3,0.2],[0.1,0.8,0.3],[0.6,0.7,0.1]])
>>> lr=ivy.array(0.1)
>>> mw_tm1 = ivy.zeros((3,3))
>>> vw_tm1 = ivy.zeros(3)
>>> step = ivy.array(1)
>>> beta1=0.8
>>> beta2=0.76
>>> epsilon=1e-4
>>> max_trust_ratio=5.7
>>> decay_lambda=1
>>> inplace=False
>>> stop_gradients = False
>>> new_weights = ivy.lamb_update(w, dcdw, lr, mw_tm1, vw_tm1, step, beta1, beta2, epsilon, max_trust_ratio, decay_lambda, inplace, stop_gradients)
>>> print(new_weights)
(ivy.array([[ 0.922 ,  1.92  ,  2.92  ],
    [ 3.92  ,  5.92  ,  1.92  ],
    [ 0.922 , -0.0782,  3.92  ]]), ivy.array([[0.1 , 0.06, 0.04],
    [0.02, 0.16, 0.06],
    [0.12, 0.14, 0.02]]), ivy.array([[0.06  , 0.0216, 0.0096],
    [0.0024, 0.154 , 0.0216],
    [0.0864, 0.118 , 0.0024]]))

With ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([1.,2.,3.]))
>>> dcdw = ivy.Container(a=ivy.array([3.,4.,5.]))
>>> mw_tm1 = ivy.Container(a=ivy.array([0.,0.,0.]))
>>> vw_tm1 = ivy.Container(a=ivy.array([0.]))
>>> lr = ivy.array(1.)
>>> step=ivy.array([2])
>>> new_weights = ivy.lamb_update(w,dcdw,mw_tm1,vw_tm1,lr,step)
>>> print(new_weights)
({
    a: ivy.array([1., 2., 3.])
}, {
    a: ivy.array([0.3, 0.4, 0.5])
}, {
    a: ivy.array([1.01, 1.01, 1.02])
})
>>> w = ivy.Container(a=ivy.array([1., 3., 5.]), b=ivy.array([3., 4., 2.]))
>>> dcdw = ivy.Container(a=ivy.array([0.2, 0.3, 0.6]), b=ivy.array([0.6, 0.4, 0.7]))
>>> mw_tm1 = ivy.Container(a=ivy.array([0., 0., 0.]), b=ivy.array([0., 0., 0.]))
>>> vw_tm1 = ivy.Container(a=ivy.array([0.]), b=ivy.array([0.]))
>>> step = ivy.array([3.4])
>>> beta1 = 0.9
>>> beta2 = 0.999
>>> epsilon = 1e-7
>>> max_trust_ratio=10
>>> decay_lambda=0
>>> inplace=None
>>> stop_gradients=True
>>> lr =ivy.array(0.5)
>>> new_weights = ivy.lamb_update(w, dcdw, lr, mw_tm1, vw_tm1, step, beta1, beta2, epsilon, max_trust_ratio, decay_lambda, inplace, stop_gradients)
>>> print(new_weights)
({
    a: ivy.array([-0.708, 1.29, 3.29]),
    b: ivy.array([1.45, 2.45, 0.445])
}, {
    a: ivy.array([0.02, 0.03, 0.06]),
    b: ivy.array([0.06, 0.04, 0.07])
}, {
    a: ivy.array([4.0e-05, 9.0e-05, 3.6e-04]),
    b: ivy.array([0.00036, 0.00016, 0.00049])
})
ivy.lars_update(w, dcdw, lr, decay_lambda=0, inplace=None, stop_gradients=True)[source]

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying Layerwise Adaptive Rate Scaling (LARS) method.

Parameters
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.

  • decay_lambda – The factor used for weight decay. Default is zero.

  • inplace – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True, provided the backend framework supports it.

  • stop_gradients – Whether to stop the gradients of the variables after each gradient step. Default is True.

Return type

Array

Returns

ret – The new function weights ws_new, following the LARS updates.
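
Functional Examples

With ivy.Array inputs (a minimal sketch, assuming the standard LARS rule with decay_lambda=0, where the layerwise trust ratio is the ratio of the weight norm to the gradient norm; here that ratio is exactly 10, so the effective step is lr * 10 * dcdw and the printed values are approximate):

>>> w = ivy.array([1., 2., 3.])
>>> dcdw = ivy.array([0.1, 0.2, 0.3])
>>> lr = ivy.array(0.1)
>>> w_new = ivy.lars_update(w, dcdw, lr, inplace=False)
>>> print(w_new)
ivy.array([0.9, 1.8, 2.7])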

ivy.optimizer_update(w, effective_grad, lr, inplace=None, stop_gradients=True)[source]

Update weights ws of some function, given the true or effective derivatives of some cost c with respect to ws, [dc/dw for w in ws].

Parameters
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.

  • inplace – Whether to perform the operation inplace, for backends which support inplace variable updates, and handle gradients behind the scenes such as PyTorch. If the update step should form part of a computation graph (i.e. higher order optimization), then this should be set to False. Default is True, provided the backend framework supports it.

  • stop_gradients – Whether to stop the gradients of the variables after each gradient step. Default is True.

Return type

Array

Returns

ret – The new function weights ws_new, following the optimizer updates.

Functional Examples

With ivy.Array inputs:

>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr, inplace=False, stop_gradients=True)
>>> print(ws_new)
ivy.array([1., 2., 3.])
>>> w = ivy.array([[1., 2.],[4., 5.]])
>>> effective_grad = ivy.array([[4., 5.],[7., 8.]])
>>> lr = ivy.array([3e-4, 1e-2])
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr, inplace=False, stop_gradients=True)
>>> print(ws_new)
ivy.array([[0.999, 1.95],
           [4., 4.92]])
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.array([4., 5., 6.])
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr, stop_gradients=False, inplace=True)
>>> print(ws_new)
ivy.array([0.999, 2., 3.])

With ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]), b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]), b=ivy.array([0., 0., 0.]))
>>> lr = 3e-4
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr, stop_gradients=True, inplace=True)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> w = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> effective_grad = ivy.Container(a=ivy.array([0., 0., 0.]), b=ivy.array([0., 0., 0.]))
>>> lr = ivy.array([3e-4])
>>> ws_new = ivy.optimizer_update(w=w, effective_grad=effective_grad, lr=lr, stop_gradients=False, inplace=False)
>>> print(ws_new)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
ivy.set_with_grads(with_grads)[source]

Enter a nested code space where gradients are computed. This method adds the with_grads component to the global list with_grads_stack.

Parameters

with_grads (bool) – Boolean value denoting whether the current code block has gradient computation enabled ('True') or disabled ('False').

Returns

ret – If with_grads is boolean, it returns the boolean value representing whether gradient computation is enabled. If with_grads is None, it returns the last element in the with_grads_stack, representing the parent of the current nested code block; if with_grads_stack is empty, it returns True by default. If with_grads is neither None nor boolean, an AssertionError is raised.

Examples

>>> ivy.set_with_grads(True)
>>> print(ivy.with_grads(with_grads=None))
True
>>> ivy.set_with_grads(False)
>>> print(ivy.with_grads(with_grads=None))
False
>>> print(ivy.with_grads(with_grads=True))
True
>>> print(ivy.with_grads(with_grads=False))
False
ivy.stop_gradient(x, preserve_type=True, *, out=None)[source]

Stops gradient computation.

Parameters
  • x (Union[Array, NativeArray]) – Array for which to stop the gradient.

  • preserve_type (bool) – Whether to preserve the input type (ivy.Variable or ivy.Array), otherwise an array is always returned. Default is True.

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. Default is None.

Return type

Array

Returns

ret – The same array x, but with no gradient information.

Functional Examples

With ivy.Array inputs:

>>> x = ivy.array([1., 2., 3.])
>>> y = ivy.stop_gradient(x, preserve_type=True)
>>> print(y)
ivy.array([1., 2., 3.])
>>> x = ivy.zeros((2, 3))
>>> ivy.stop_gradient(x, preserve_type=False, out=x)
>>> print(x)
ivy.array([[0., 0., 0.],
           [0., 0., 0.]])

With ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> y = ivy.stop_gradient(x, preserve_type=False)
>>> print(y)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
>>> x = ivy.Container(a=ivy.array([0., 1., 2.]), b=ivy.array([3., 4., 5.]))
>>> ivy.stop_gradient(x, preserve_type=True, out=x)
>>> print(x)
{
    a: ivy.array([0., 1., 2.]),
    b: ivy.array([3., 4., 5.])
}
ivy.unset_with_grads()[source]
ivy.value_and_grad(func)[source]

Create a function that evaluates both func and the gradient of func.

Parameters

func – Function for which we compute the gradients of the output with respect to xs input.

Returns

ret – A function that returns both func and the gradient of func.
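
Functional Examples

A minimal sketch of the returned function, assuming func maps an array to a scalar and the wrapped function returns a (value, gradient) pair:

>>> func = lambda x: ivy.sum(x ** 2)
>>> value_grad_fn = ivy.value_and_grad(func)
>>> x = ivy.variable(ivy.array([1., 2., 3.]))
>>> value, grad = value_grad_fn(x)
>>> print(value)
ivy.array(14.)
>>> print(grad)
ivy.array([2., 4., 6.])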

ivy.variable(x)[source]

Creates a variable, which supports gradient computation.

Parameters

x (Union[Array, NativeArray]) – An ivy array.

Return type

Variable

Returns

ret – An ivy variable, supporting gradient computation.

Examples

With ivy.Array input:

>>> ivy.set_backend('torch')
>>> x = ivy.array([1., 0.3, -4.5])
>>> y = ivy.variable(x)
>>> y
ivy.array([ 1. ,  0.3, -4.5])
>>> ivy.unset_backend()

With ivy.NativeArray input:

>>> ivy.set_backend('jax')
>>> x = ivy.native_array([0.2, 2., 3.])
>>> y = ivy.variable(x)
>>> y
ivy.array([0.2, 2., 3.])
>>> ivy.unset_backend()

With ivy.Container input:

>>> ivy.set_backend('tensorflow')
>>> x = ivy.Container(a=ivy.array([1., 2.]), b=ivy.array([-0.2, 4.]))
>>> y = ivy.variable(x)
>>> y
{
    a: ivy.array([1., 2.]),
    b: ivy.array([-0.2, 4.])
}
>>> ivy.unset_backend()
ivy.variable_data(x)[source]

Some backends wrap arrays in a dedicated variable class. For those frameworks, this function returns that wrapped array. For frameworks which do not have a dedicated variable class, the function returns the data passed in.

Parameters

x – An ivy variable.

Returns

ret – The internal data stored by the variable
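
Examples

A minimal sketch, assuming a backend such as PyTorch that wraps arrays in a variable class; the exact type of the returned data depends on the backend:

>>> ivy.set_backend('torch')
>>> x = ivy.variable(ivy.array([1., 2., 3.]))
>>> data = ivy.variable_data(x)
>>> # data is the array wrapped by the variable; on backends without a
>>> # dedicated variable class, the input is returned unchanged
>>> ivy.unset_backend()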

ivy.with_grads(with_grads=None)[source]

Enter a nested code space where gradients are computed. This method adds the with_grads component to the global list with_grads_stack.

Parameters

with_grads (Optional[bool]) – Boolean value denoting whether the current code block has gradient computation enabled or not. 'True' or 'False' or 'None' (Default value = None)

Return type

bool

Returns

ret – If with_grads is boolean, it returns the boolean value representing whether gradient computation is enabled. If with_grads is None, it returns the last element in the with_grads_stack, representing the parent of the current nested code block; if with_grads_stack is empty, it returns True by default. If with_grads is neither None nor boolean, an AssertionError is raised.

Examples

>>> ivy.set_with_grads(True)
>>> print(ivy.with_grads(with_grads=None))
True
>>> ivy.set_with_grads(False)
>>> print(ivy.with_grads(with_grads=None))
False
>>> print(ivy.with_grads(with_grads=True))
True
>>> print(ivy.with_grads(with_grads=False))
False