lars_update#

ivy.lars_update(w, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)[source]#

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the Layerwise Adaptive Rate Scaling (LARS) method.

Parameters:
  • w (Union[Array, NativeArray]) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.

  • decay_lambda (float, default: 0) – The factor used for weight decay. Default is zero.

  • stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Array], default: None) – Optional output array for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The new function weights ws_new, following the LARS updates.
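
LARS scales the learning rate layer-wise by a "trust ratio", the weight norm divided by the gradient norm, before taking a plain gradient-descent step. Below is a minimal NumPy sketch of the decay-free case, using a hypothetical helper lars_step; it is an illustrative reimplementation for intuition, not Ivy's source, and the weight-decay term controlled by decay_lambda is omitted:

>>> import numpy as np
>>> def lars_step(w, dcdw, lr):
...     # Layer-wise trust ratio: weight norm over gradient norm.
...     trust = np.linalg.norm(w) / np.linalg.norm(dcdw)
...     # Plain gradient-descent step with the scaled learning rate.
...     return w - lr * trust * dcdw

Applied to the first example below, this reproduces the printed values up to float precision: with ||w|| = 13 and ||dcdw|| = sqrt(0.35) ≈ 0.5916, the effective learning rate is 0.1 * 13 / 0.5916 ≈ 2.197.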

Examples

With ivy.Array inputs:

>>> w = ivy.array([[3., 1, 5],
...                [7, 2, 9]])
>>> dcdw = ivy.array([[0.3, 0.1, 0.2],
...                   [0.1, 0.2, 0.4]])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
ivy.array([[2.34077978, 0.78025991, 4.56051969],
           [6.78026009, 1.56051981, 8.12103939]])
>>> w = ivy.array([3., 1, 5])
>>> dcdw = ivy.array([0.3, 0.1, 0.2])
>>> lr = ivy.array(0.1)
>>> out = ivy.zeros_like(dcdw)
>>> ivy.lars_update(w, dcdw, lr, out=out)
>>> print(out)
ivy.array([2.52565837, 0.8418861 , 4.68377209])
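
Here the trust ratio is ||w|| / ||dcdw|| = sqrt(35) / sqrt(0.14) ≈ 15.81, so the effective learning rate is about 1.581 and, for instance, the first weight becomes 3 - 1.581 * 0.3 ≈ 2.5257, as printed.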

With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.array([0.2, 0.4, 0.1])
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([1.1324538, 2.56490755, 4.96622658])
}
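
Note that the single Array dcdw is broadcast to every leaf of the Container, with the trust ratio computed separately per leaf, which is why a and b receive different effective learning rates.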

With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.Container(a=ivy.array([0.2, 0.4, 0.1]),
...                      b=ivy.array([0.3, 0.1, 0.2]))
>>> lr = ivy.array(0.1)
>>> new_weights = ivy.lars_update(w, dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([0.90848625, 2.93616199, 4.77232409])
}
Array.lars_update(self, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)[source]#

ivy.Array instance method variant of ivy.lars_update. This method simply wraps the function, and so the docstring for ivy.lars_update also applies to this method with minimal changes.

Parameters:
  • self (Array) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.

  • decay_lambda (float, default: 0) – The factor used for weight decay. Default is zero.

  • stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Array], default: None) – Optional output array for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Array

Returns:

ret – The new function weights ws_new, following the LARS updates.

Examples

With ivy.Array inputs:

>>> w = ivy.array([[3., 1, 5],
...                [7, 2, 9]])
>>> dcdw = ivy.array([[0.3, 0.1, 0.2],
...                   [0.1, 0.2, 0.4]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[2.34077978, 0.78025991, 4.56051969],
           [6.78026009, 1.56051981, 8.12103939]])
Container.lars_update(self, dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)[source]#

Update weights ws of some function, given the derivatives of some cost c with respect to ws, [dc/dw for w in ws], by applying the Layerwise Adaptive Rate Scaling (LARS) method.

Parameters:
  • self (Container) – Weights of the function to be updated.

  • dcdw (Union[Array, NativeArray, Container]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].

  • lr (Union[float, Array, NativeArray, Container]) – Learning rate, the rate at which the weights should be updated relative to the gradient.

  • decay_lambda (Union[float, Container], default: 0) – The factor used for weight decay. Default is zero.

  • stop_gradients (Union[bool, Container], default: True) – Whether to stop the gradients of the variables after each gradient step. Default is True.

  • out (Optional[Container], default: None) – Optional output container for writing the result to. It must have a shape that the inputs broadcast to.

Return type:

Container

Returns:

ret – The new function weights ws_new, following the LARS updates.

Examples

With one ivy.Container input:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.array([0.2, 0.4, 0.1])
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([1.1324538, 2.56490755, 4.96622658])
}

With multiple ivy.Container inputs:

>>> w = ivy.Container(a=ivy.array([3.2, 2.6, 1.3]),
...                   b=ivy.array([1.4, 3.1, 5.1]))
>>> dcdw = ivy.Container(a=ivy.array([0.2, 0.4, 0.1]),
...                      b=ivy.array([0.3, 0.1, 0.2]))
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr)
>>> print(new_weights)
{
    a: ivy.array([3.01132035, 2.22264051, 1.2056601]),
    b: ivy.array([0.90848625, 2.93616199, 4.77232409])
}