Layers

Collection of Ivy neural network layers in functional form.

ivy.conv(x, filters, strides, padding, /, *, transpose=False, dims=2, output_shape=None, data_format='channel_last', feature_group_count=1, x_dilations=1, dilations=1, out=None)[source]
Return type

Array

ivy.conv1d(x, filters, strides, padding, /, *, data_format='NWC', dilations=1, out=None)[source]

Computes a 1-D convolution given 3-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NWC” or “NCW”. Defaults to “NWC”. (default: 'NWC')

  • dilations (int) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Examples

With ivy.Array input:

>>> x = ivy.asarray([[[0.], [3.], [0.]]]) #NWC
>>> filters = ivy.array([[[0.]], [[1.]], [[0.]]]) #WIO
>>> result = ivy.conv1d(x, filters, (1,), 'SAME', data_format='NWC', dilations=(1,))
>>> print(result)
ivy.array([[[0.], [3.], [0.]]])

With ivy.NativeArray input:

>>> x = ivy.native_array([[[1., 3.], [2., 4.], [5., 7]]])
>>> filters = ivy.native_array([[[0., 1.], [1., 0.]]])
>>> result = ivy.conv1d(x, filters, (2,),'VALID')
>>> print(result)
ivy.array([[[3., 1.],
            [7., 5.]]])

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([[[1.2, 3.1, 4.8], [5.9, 2.2, 3.3],
...                                 [10.8, 7.6, 4.9], [6.1, 2.2, 9.5]]]),
...                   b=ivy.array([[[8.8, 7.7, 6.6], [1.1, 2.2, 3.5]]]))
>>> filters = ivy.array([[[1., 0., 1.], [0., 1., 0.], [1., 1., 0.]]])
>>> result  = ivy.conv1d(x, filters, 3, 'VALID')
>>> print(result)
{
    a: ivy.array([[[6., 7.9, 1.2],
                   [15.6, 11.7, 6.1]]]),
    b: ivy.array([[[15.4, 14.3, 8.8]]])
}
ivy.conv1d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NWC', dilations=1, out=None)[source]

Computes a 1-D transpose convolution given 3-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)

  • data_format (str) – “NWC” or “NCW”. Defaults to “NWC”. (default: 'NWC')

  • dilations (int) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the transpose convolution operation.
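Transpose convolution is easiest to read as a scatter-add: each input position writes a filter-sized, input-weighted patch into the output. The following NumPy sketch illustrates the 'VALID' case only; conv1d_transpose_valid is a hypothetical helper written for this page, not part of the Ivy API.

```python
import numpy as np

def conv1d_transpose_valid(x, filters, stride):
    # x: [batch, w, d_in], filters: [fw, d_in, d_out]
    batch, w, d_in = x.shape
    fw, _, d_out = filters.shape
    out_w = (w - 1) * stride + fw  # 'VALID' output width
    y = np.zeros((batch, out_w, d_out))
    for i in range(w):
        # each input position scatters a filter-sized patch into the output
        patch = np.einsum('bi,fio->bfo', x[:, i, :], filters)
        y[:, i * stride:i * stride + fw, :] += patch
    return y

x = np.ones((1, 3, 2))
filters = np.ones((2, 2, 4))
print(conv1d_transpose_valid(x, filters, 2).shape)  # (1, 6, 4)
```

With the stride equal to the filter width, the scattered patches tile the output without overlap, which is why this example fills the output uniformly.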

ivy.conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Examples

With ivy.Array input:

>>> x = ivy.array([[[[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]]]]) #NHWC
>>> filters = ivy.array([[[[0.]],[[1.]],[[0.]]],
...                      [[[0.]],[[1.]], [[0.]]],
...                      [[[0.]],[[1.]], [[0.]]]]) #HWIO
>>> result = ivy.conv2d(x, filters, (1,), 'SAME', data_format='NHWC',
...                     dilations=(1,))
>>> print(result)
ivy.array([[
          [[2.],[4.],[6.]],
          [[3.],[6.],[9.]],
          [[2.],[4.],[6.]]
          ]])

With one ivy.Container input:

>>> x = ivy.Container(a=ivy.array([[[[1.], [2.0],[3.]],
...                                 [[1.], [2.0],[3.]],
...                                 [[1.], [2.0],[3.]]]]))
>>> filters = ivy.eye(3, 3).reshape((3, 3, 1, 1)).astype(ivy.float32)
>>> result = ivy.conv2d(x, filters, (2,), 'SAME', data_format='NHWC',
...                     dilations=(1,))
>>> print(result)
{
    a: ivy.array([[[[3.], [3.]], [[1.], [5.]]]])
}

With multiple ivy.Container inputs:

>>> x = ivy.Container(a = ivy.eye(3, 3).reshape((1, 3, 3, 1)),
...                   b = ivy.eye(4, 4).reshape((1, 4, 4, 1)),
...                   c = ivy.eye(5, 5).reshape((1, 5, 5, 1)))
>>> filters = ivy.array([[1, 1, 1],
...                      [0, 1, 1],
...                      [0, 0, 1]], dtype = ivy.float32).reshape((3, 3, 1, 1))
>>> result = ivy.conv2d(x, filters, (2,), 'SAME')
>>> print(result)
{
    a: ivy.array([[[[2.], [0.]], [[1.], [2.]]]]),
    b: ivy.array([[[[3.], [0.]], [[1.], [2.]]]]),
    c: ivy.array([[[[2.], [0.], [0.]],
                   [[1.], [3.], [0.]],
                   [[0.], [1.], [2.]]]])
}

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a = ivy.eye(3, 3).reshape((1, 3, 3, 1)),
...                   b = ivy.eye(5, 5).reshape((1, 5, 5, 1)))
>>> filters = ivy.array([[2, 0, 1],
...                      [1, 3, 1],
...                      [0, 1, 1]], dtype = ivy.float32).reshape((3, 3, 1, 1))
>>> result = ivy.conv2d(x, filters, (2,), 'SAME')
>>> print(result)
{
    a: ivy.array([[[[4.], [0.]], [[1.], [5.]]]]),
    b: ivy.array([[[[4.], [0.], [0.]], [[1.], [6.], [0.]], [[0.], [1.], [5.]]]])
}
ivy.conv2d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D transpose convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Union[int, Tuple[int], Tuple[int, int]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the transpose convolution operation.
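The output_shape argument exists because, when the stride is greater than 1, several output sizes reduce to the same input size under the forward convolution, so the inverse is ambiguous. The pure-Python sketch below enumerates the candidates for one dimension of a 'VALID' convolution; valid_output_sizes is a hypothetical helper written for this page, not part of Ivy.

```python
import math

def valid_output_sizes(in_size, stride, kernel):
    # All output sizes s whose forward 'VALID' convolution,
    # ceil((s - kernel + 1) / stride), maps back to in_size.
    return [s for s in range(1, in_size * stride + kernel)
            if math.ceil(max(s - kernel + 1, 0) / stride) == in_size]

print(valid_output_sizes(4, 2, 3))  # [9, 10]
```

Both 9 and 10 are legitimate transpose-convolution output sizes here; passing output_shape picks one explicitly instead of leaving the choice to the backend.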

ivy.conv3d(x, filters, strides, padding, /, *, data_format='NDHWC', dilations=1, out=None)[source]

Computes a 3-D convolution given 5-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input volume [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NDHWC” or “NCDHW”. Defaults to “NDHWC”. (default: 'NDHWC')

  • dilations (int) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Examples

>>> x1 = [[[1.],[2.]],[[1.],[2.]],[[1.],[2.]]]
>>> x2 = [[[3.],[4.]],[[3.],[4.]],[[3.],[4.]]]
>>> x = ivy.array([[x1,x2]]) #NDHWC
>>> filters = ivy.array([[[[[1]],[[0.]]]]]) #DHWIO
>>> result = ivy.conv3d(x, filters, 1, 'VALID', data_format="NDHWC", dilations=1)
>>> print(result)
ivy.array([[
    [
        [[1.]],[[1.]],[[1.]]
    ],
    [
        [[3.]],[[3.]],[[3.]]
    ]
        ]])
ivy.conv3d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NDHWC', dilations=1, out=None)[source]

Computes a 3-D transpose convolution given 5-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)

  • data_format (str) – “NDHWC” or “NCDHW”. Defaults to “NDHWC”. (default: 'NDHWC')

  • dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the transpose convolution operation.

Functional Examples

With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)

With ivy.NativeArray input:

>>> x = ivy.native_array(
...    ivy.random_normal(mean=0, std=1, shape=[1, 7, 256, 256, 64])
... )
>>> filters = ivy.native_array(
...    ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 64, 32])
... )
>>> y = ivy.conv3d_transpose(x, filters, [1, 1, 1], 'VALID')
>>> print(y.shape)
(1, 9, 258, 258, 32)

With ivy.Container inputs:

>>> x = ivy.Container(a = ivy.random_normal(
...                           mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c = ivy.random_normal(
...                                 mean=0, std=1, shape=[3, 3, 3, 3, 6]),
...                         d = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6]))
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
[1, 6, 56, 56, 6]

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 2.7)
>>> a =  ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> b =  ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]

With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 1.23)
>>> a =  ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> b =  ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]

Instance Method Examples

Using ivy.Array instance method:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)

Using ivy.Container instance method:

>>> x = ivy.Container(a = ivy.random_normal(
...                           mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c = ivy.random_normal(
...                                 mean=0, std=1, shape=[3, 3, 3, 3, 3]),
...                         d = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 3]))
>>> y = x.conv3d_transpose(filters, 2, "SAME")
>>> print(y.shape)
(1, 6, 56, 56, 3)
ivy.conv_general_dilated(x, filters, strides, padding, /, *, dims=2, data_format='channel_last', feature_group_count=1, x_dilations=1, dilations=1, out=None)[source]

Computes a 1-D, 2-D or 3-D convolution given 3-D, 4-D or 5-D input x, respectively, and corresponding filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • dims (int) – The number of spatial dimensions of the convolution: 1, 2 or 3. (default: 2)

  • data_format (str) – “channel_first” or “channel_last”. Defaults to “channel_last”. (default: 'channel_last')

  • dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

ivy.conv_general_transpose(x, filters, strides, padding, /, *, dims=2, output_shape=None, data_format='channel_last', dilations=1, feature_group_count=1, out=None)[source]

Computes a 1-D, 2-D or 3-D transpose convolution given 3-D, 4-D or 5-D input x, respectively, and corresponding filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • dims (int) – The number of spatial dimensions of the transpose convolution: 1, 2 or 3. (default: 2)

  • data_format (str) – “channel_first” or “channel_last”. Defaults to “channel_last”. (default: 'channel_last')

  • dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the transpose convolution operation.

ivy.deconv_length(dim_size, stride_size, kernel_size, padding, dilation=1)[source]
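ivy.deconv_length carries no description above. The usual output-length rule for transpose convolutions, which the examples on this page are consistent with (e.g. 'SAME' with stride 2 maps 28 to 56, and 'VALID' with stride 1 and a 3-tap kernel maps 6 to 8), can be sketched as follows. deconv_length_sketch is a reading of the standard formula, not necessarily Ivy's exact implementation.

```python
def deconv_length_sketch(dim_size, stride_size, kernel_size, padding, dilation=1):
    # Effective kernel size once dilation spreads the taps apart.
    kernel_size = (kernel_size - 1) * dilation + 1
    if padding == 'SAME':
        return dim_size * stride_size
    # 'VALID': every input step advances the output by the stride,
    # plus whatever the kernel overhangs beyond one stride.
    return dim_size * stride_size + max(kernel_size - stride_size, 0)

print(deconv_length_sketch(28, 2, 3, 'SAME'))  # 56
print(deconv_length_sketch(6, 1, 3, 'VALID'))  # 8
```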
ivy.depthwise_conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D depthwise convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in]. (d_in must be the same as d from x)

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Examples

With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3]) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, (1, 1), 'VALID')
>>> print(y.shape)
(1, 26, 26, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 32, 32, 3]) #NHWC
>>> y = ivy.zeros_like(x)
>>> filters = ivy.random_normal(mean=0, std=1, shape=[5, 5, 3]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [2, 2], 'SAME', out=y)
>>> print(y.shape)
(1, 16, 16, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 64, 64, 32]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[4, 4, 32]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [1, 1], 'VALID', out=x)
>>> print(x.shape)
(1, 61, 61, 32)

With ivy.NativeArray input:

>>> x = ivy.native_array(
...     ivy.random_normal(mean=0, std=1, shape=[1, 7, 7, 64])
... ) #NHWC
>>> filters = ivy.native_array(
...    ivy.random_normal(mean=0, std=1, shape=[3, 3, 64])
... ) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, [1, 1], 'SAME')
>>> print(y.shape)
(1, 7, 7, 64)

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> a = ivy.array([[1., 1., 1.], [1., -8., 1.], [1., 1., 1.]]).expand_dims(axis=-1)
>>> b = ivy.array([[1., 1., 1.],
...                [1., 1., 1.],
...                [1., 1., 1.]]).expand_dims(axis=-1) / 9.0
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.depthwise_conv2d(x, filters, 1, 'VALID', dilations=2)
>>> print(y)
{
    a: ivy.array([[[[-6.],
                    [0.]],
                   [[0.],
                    [-6.]]]]),
    b: ivy.array([[[[0.333],
                    [0.]],
                   [[0.],
                    [0.333]]]])
}

With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> y = ivy.native_array(ivy.eye(6, 6).reshape((1, 6, 6, 1)))
>>> inp = ivy.Container(x = x, y = y)
>>> filter = ivy.array([[1., 1., 1.],
...                     [1., -8., 1.],
...                     [1., 1., 1.]]).expand_dims(axis=-1)
>>> y = ivy.depthwise_conv2d(inp, filter, 1, 'VALID', dilations=2)
>>> print(y)
{
    x: ivy.array([[[[-6.],
                    [0.]],
                   [[0.],
                    [-6.]]]]),
    y: ivy.array([[[[-6.],[0.]],[[0.],[-6.]]]])
}
ivy.dropout(x, prob, /, *, scale=True, dtype=None, out=None)[source]

Randomly zeroes some elements of the input tensor with probability prob using samples from a Bernoulli distribution.

Parameters
  • x (Union[Array, NativeArray]) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element.

  • scale (bool) – Whether to scale the output by 1/(1 - prob). Default is True. (default: True)

  • dtype (Optional[Dtype]) – data type of the output array. (default: None)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – Result array after the dropout operation.
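The scaling behaviour described above is the usual "inverted dropout": surviving elements are multiplied by 1/(1 - prob) so the expected value of the output matches the input. A NumPy sketch of the idea follows; dropout_sketch and its seed parameter are illustrative only, not Ivy API.

```python
import numpy as np

def dropout_sketch(x, prob, scale=True, seed=None):
    rng = np.random.default_rng(seed)
    keep = rng.random(x.shape) >= prob  # Bernoulli keep-mask
    y = np.where(keep, x, 0.0)
    if scale:
        y = y / (1.0 - prob)            # keeps E[y] == x
    return y

y = dropout_sketch(np.ones((2, 4)), 0.5, seed=0)
# every entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```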

ivy.dropout1d(x, prob, /, *, training=True, data_format='NWC', out=None)[source]
Randomly zero out entire channels with probability prob using samples from a Bernoulli distribution, scaling the remaining channels by 1/(1 - prob). dropout1d performs a channel-wise dropout, treating each channel as a 1-D feature map.

Parameters
  • x (Union[Array, NativeArray]) – a 2D or 3D input array. Should have a floating-point data type.

  • prob (float) – probability of a channel to be zeroed.

  • training (bool) – controls whether dropout1d is performed during training or ignored during testing. (default: True)

  • data_format (str) – “NWC” or “NCW”. Defaults to “NWC”. (default: 'NWC')

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – an array with some channels zeroed and the remaining channels scaled by 1/(1 - prob).
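The channel-wise behaviour differs from plain dropout: a single Bernoulli draw per channel zeroes the entire 1-D feature map. A NumPy sketch for 'NWC' data follows; dropout1d_sketch is a hypothetical helper for illustration, not Ivy API.

```python
import numpy as np

def dropout1d_sketch(x, prob, seed=None):
    # x: [batch, w, channels]; one keep/drop decision per (batch, channel),
    # broadcast across the whole W axis
    rng = np.random.default_rng(seed)
    keep = rng.random((x.shape[0], 1, x.shape[-1])) >= prob
    return np.where(keep, x / (1.0 - prob), 0.0)

y = dropout1d_sketch(np.ones((1, 6, 8)), 0.5, seed=1)
# each channel is constant along the W axis: all zeros or all 2.0
```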

ivy.fft(x, dim, /, *, norm='backward', n=None, out=None)[source]

Computes the one dimensional discrete Fourier transform of the at least 1-D input x.

Parameters
  • x (Union[Array, NativeArray]) – Input volume […,d_in,…], where d_in indicates the dimension that needs FFT.

  • dim (int) – The dimension along which to take the one dimensional FFT.

  • norm (Optional[str]) – Optional argument, “backward”, “ortho” or “forward”. Defaults to “backward”. “backward” indicates no normalization, “ortho” indicates normalization by 1/sqrt(n), and “forward” indicates normalization by 1/n. (default: 'backward')

  • n (Optional[Union[Tuple[int], int]]) – Optional argument indicating the sequence length. If given, the input is padded with zeros or truncated to length n before the FFT is performed. Should be an integer greater than 1. (default: None)

  • out (Optional[Array]) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the FFT operation.

Examples

>>> import numpy as np
>>> ivy.fft(np.exp(2j * np.pi * np.arange(8) / 8), 0)
ivy.array([-3.44509285e-16+1.14423775e-17j,  8.00000000e+00-8.11483250e-16j,
        2.33486982e-16+1.22464680e-16j,  0.00000000e+00+1.22464680e-16j,
        9.95799250e-17+2.33486982e-16j,  0.00000000e+00+7.66951701e-17j,
        1.14423775e-17+1.22464680e-16j,  0.00000000e+00+1.22464680e-16j])
>>> ivy.fft(np.exp(2j * np.pi * np.arange(8) / 8), 0, n=16)
ivy.array([-3.44509285e-16+1.14423775e-17j,  1.00000000e+00+5.02733949e+00j,
    8.00000000e+00-8.11483250e-16j,  1.00000000e+00-5.02733949e+00j,
    2.33486982e-16+1.22464680e-16j,  1.00000000e+00-1.49660576e+00j,
    0.00000000e+00+1.22464680e-16j,  1.00000000e+00-6.68178638e-01j,
    9.95799250e-17+2.33486982e-16j,  1.00000000e+00-1.98912367e-01j,
    0.00000000e+00+7.66951701e-17j,  1.00000000e+00+1.98912367e-01j,
    1.14423775e-17+1.22464680e-16j,  1.00000000e+00+6.68178638e-01j,
    0.00000000e+00+1.22464680e-16j,  1.00000000e+00+1.49660576e+00j])
>>> ivy.fft(np.exp(2j * np.pi * np.arange(8) / 8), 0, norm="ortho")
ivy.array([-1.21802426e-16+4.04549134e-18j,  2.82842712e+00-2.86902654e-16j,
    8.25501143e-17+4.32978028e-17j,  0.00000000e+00+4.32978028e-17j,
    3.52068201e-17+8.25501143e-17j,  0.00000000e+00+2.71158374e-17j,
    4.04549134e-18+4.32978028e-17j,  0.00000000e+00+4.32978028e-17j])
ivy.get_x_data_format(dims=2, data_format='channel_first')[source]
ivy.handle_padding(x, strides, filters, padding)[source]
ivy.linear(x, weight, /, *, bias=None, out=None)[source]

Applies a linear transformation to the incoming data: y = x * t(weight) + bias. The operation also supports batching of the weight matrices. This is useful if a batch of different network parameters are to be represented.

Parameters
  • x (Union[Array, NativeArray]) – The input x to compute linear transformation on. [outer_batch_shape,inner_batch_shape,in_features]

  • weight (Union[Array, NativeArray]) – The weight matrix. [outer_batch_shape,out_features,in_features]

  • bias (Optional[Union[Array, NativeArray]]) – The bias vector, default is None. [outer_batch_shape,out_features] (default: None)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – Result array of the linear transformation. [outer_batch_shape,inner_batch_shape,out_features]

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Examples

With ivy.Array input:

>>> x = ivy.array([1., 2., 3.])
>>> w = ivy.array([[1., 0., 0.]])
>>> y = ivy.linear(x, w)
>>> print(y)
ivy.array([1.])
>>> x = ivy.array([[0.666, -0.4269, 1.911]])
>>> w = ivy.array([[1., 0., 0.], [0., 0., 1.]])
>>> y = ivy.zeros(2)
>>> ivy.linear(x, w, out=y)
>>> print(y)
ivy.array([[0.666, 1.91 ]])
>>> x = ivy.array([[1.546, 5.234, 6.487],
...                [0.157, 5.753, 4.52],
...                [5.165, 3.159, 7.101]])
>>> w = ivy.array([[1.545, 2.547, 3.124],
...                [5.852, 8.753, 6.963]])
>>> b = ivy.array([-1., 1.])
>>> ivy.linear(x, w, bias=b, out=x)
>>> print(x)
ivy.array([[ 35. , 101. ],
           [ 28. ,  83.7],
           [ 37.2, 108. ]])

With ivy.Container input:

>>> x = ivy.Container(a=ivy.array([[1., 2., 3.],
...                                [4., 5., 6.]]),
...                   b=ivy.array([1.1, 2.2, 3.3]))
>>> w = ivy.Container(a=ivy.array([[1., 2., 3.],
...                                [-1., 1., 2.]]),
...                   b=ivy.array([[0., -1., 1.],
...                                [0., 1., 1.]]))
>>> b = ivy.Container(a=ivy.array([1., -1.]), b=ivy.array([1., 1.]))
>>> y = ivy.linear(x, w, bias=b)
>>> print(y)
{
    a: ivy.array([[15., 6.],
                  [33., 12.]]),
    b: ivy.array([2.1, 6.5])
}

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([[1.1, 2.2, 3.3],
...                                [11., 22., 33.]]),
...                   b=ivy.array([[1.245, 0.278, 4.105],
...                                [7., 13., 17.]]))
>>> w = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.]])
>>> b = ivy.Container(a=ivy.array([1., 0., -1.]),
...                   b=ivy.array([1., 1., 0.]))
>>> ivy.linear(x, w, bias=b, out=x)
>>> print(x)
{
    a: ivy.array([[16.4, 35.2, 54.],
                  [155., 352., 549.]]),
    b: ivy.array([[15.1, 32., 47.9],
                  [85., 196., 306.]])
}
ivy.lstm_update(x, init_h, init_c, kernel, recurrent_kernel, /, *, bias=None, recurrent_bias=None)[source]

Perform long-short term memory update by unrolling time dimension of input array.

Parameters
  • x (Union[Array, NativeArray]) – input tensor of LSTM layer [batch_shape, t, in].

  • init_h (Union[Array, NativeArray]) – initial state tensor for the cell output [batch_shape, out].

  • init_c (Union[Array, NativeArray]) – initial state tensor for the cell hidden state [batch_shape, out].

  • kernel (Union[Array, NativeArray]) – weights for cell kernel [in, 4 x out].

  • recurrent_kernel (Union[Array, NativeArray]) – weights for cell recurrent kernel [out, 4 x out].

  • bias (Optional[Union[Array, NativeArray]]) – bias for cell kernel [4 x out]. (Default value = None) (default: None)

  • recurrent_bias (Optional[Union[Array, NativeArray]]) – bias for cell recurrent kernel [4 x out]. (Default value = None) (default: None)

Return type

Tuple[Array, Array]

Returns

ret – hidden state for all timesteps [batch_shape,t,out] and cell state for last timestep [batch_shape,out]
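The time-dimension unrolling described above can be sketched in NumPy. The kernel's [in, 4 x out] columns are assumed here to split in i, f, g, o gate order, which is an assumption about the layout rather than something this page states, and the optional biases are omitted for brevity; lstm_update_sketch is a hypothetical helper, not Ivy API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_update_sketch(x, init_h, init_c, kernel, recurrent_kernel):
    # x: [batch, t, in]; kernel: [in, 4*out]; recurrent_kernel: [out, 4*out]
    batch, t, _ = x.shape
    h, c = init_h, init_c
    hs = []
    for step in range(t):
        z = x[:, step] @ kernel + h @ recurrent_kernel
        i, f, g, o = np.split(z, 4, axis=-1)  # assumed i, f, g, o gate order
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        hs.append(h)
    # hidden states for all timesteps, cell state for the last timestep
    return np.stack(hs, axis=1), c

x = np.zeros((2, 5, 3))
h0, c0 = np.zeros((2, 4)), np.zeros((2, 4))
k, rk = np.zeros((3, 16)), np.zeros((4, 16))
hs, c = lstm_update_sketch(x, h0, c0, k, rk)
print(hs.shape, c.shape)  # (2, 5, 4) (2, 4)
```

The returned shapes match the documented contract: [batch_shape, t, out] for the hidden states and [batch_shape, out] for the final cell state.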

ivy.multi_head_attention(x, scale, num_heads, /, *, context=None, mask=None, to_q_fn=None, to_kv_fn=None, to_out_fn=None, to_q_v=None, to_kv_v=None, to_out_v=None, out=None)[source]

Applies multi-head attention to inputs x.

Parameters
  • x (Union[Array, NativeArray]) – The array to determine the queries from [batch_shape,num_queries,query_dim].

  • scale (float) – The value by which to scale the query-key similarity measure before softmax.

  • num_heads (int) – The number of attention heads to use.

  • context (Optional[Union[Array, NativeArray]]) – The array to determine the keys and values from [batch_shape,num_keys,cont_feat_dim]. Default is None. (default: None)

  • mask (Optional[Union[Array, NativeArray]]) – The mask to apply to the query-key values [batch_shape,num_queries,num_keys]. Default is None. (default: None)

  • to_q_fn (Optional[Callable]) – The function to compute queries from input x, returning queries [batch_shape,num_queries,num_heads×head_dim]. (default: None)

  • to_kv_fn (Optional[Callable]) – The function to compute keys and values from the context. (default: None)

  • to_out_fn (Optional[Callable]) – The function to compute the output from the scaled dot-product attention. (default: None)

  • to_q_v – The variables for function to_q_fn. Default is None.

  • to_kv_v – The variables for function to_kv_fn. Default is None.

  • to_out_v – The variables for function to_out_fn. Default is None.

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Union[Array, NativeArray]

Returns

ret – The output following application of multi-head attention. [batch_shape,num_queries,out_feat_dim]

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Examples

With ivy.Array input:

>>> x = ivy.array([[[0.2, 1.],
...                 [2.2, 3.],
...                 [4.4, 5.6]]])
>>> context = ivy.array([[[0.2, 1., 1.1, 4.2],
...                       [2.2, 3., 0.9, 3.6],
...                       [4.4, 5.6, 2.2, 0.4]]])
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result)
ivy.array([[[1.5678761 , 0.65441847],
            [2.18969631, 0.40131447],
            [2.19991851, 0.40000153]]])

With ivy.NativeArray input:

>>> x = ivy.native_array([[[0.2, 1.],
...                        [2.2, 3.],
...                        [4.4, 5.6]]])
>>> context = ivy.native_array([[[0.2, 1., 1.1, 4.2],
...                              [2.2, 3., 0.9, 3.6],
...                              [4.4, 5.6, 2.2, 0.4]]])
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result)
ivy.array([[[1.5678761 , 0.65441847],
            [2.18969631, 0.40131447],
            [2.19991851, 0.40000153]]])

With ivy.Container input:

>>> x = ivy.Container(a=ivy.array([[[0.2, 1.1], [2.2, 3.4], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.4, 0.3], [1.2, 3.9], [0.4, 3.7]]]))
>>> context = ivy.Container(a=ivy.array([[[0.2, 1.8, 1.1, 4.2],
...                                       [2.2, 3.3, 0.9, 3.6],
...                                       [4.4, 5.6, 2.2, 0.4]]]),
...                         b=ivy.array([[[1.4, 0.3, 4.4, 5.6],
...                                       [1.2, 3.9, 4.2, 5.1],
...                                       [0.4, 3.7, 4.3, 5.3]]]))
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result)
{
    a: ivy.array([[[1.5678761, 0.68589532],
                   [2.18969631, 0.40129396],
                   [2.19991851, 0.40000817]]]),
    b: ivy.array([[[4.31219625, 5.25698996],
                   [4.31022024, 5.16286421],
                   [4.30296469, 5.16460133]]])
}

With a mix of ivy.Container and ivy.Array inputs:

>>> x = ivy.Container(a=ivy.array([[[0.2, 1.1], [2.2, 3.4], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.4, 0.3], [1.2, 3.9], [0.4, 3.7]]]))
>>> context = ivy.array([[[0.2, 1., 1.1, 4.2],
...                       [2.2, 3., 0.9, 3.6],
...                       [4.4, 5.6, 2.2, 0.4]]])
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result)
{
    a: ivy.array([[[1.5678761, 0.59497029],
                   [2.18969631, 0.40046397],
                   [2.19991851, 0.40000153]]]),
    b: ivy.array([[[2.14009905, 1.81691194],
                   [2.10732293, 0.40012637],
                   [1.73519301, 0.40021262]]])
}

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.array([[[0.2, 1.],
...                 [2.2, 3.],
...                 [4.4, 5.6]]])
>>> context = ivy.Container(a=ivy.array([[[0.2, 1.8, 1.1, 4.2],
...                                       [2.2, 3.3, 0.9, 3.6],
...                                       [4.4, 5.6, 2.2, 0.4]]]),
...                         b=ivy.array([[[1.4, 0.3, 4.4, 5.6],
...                                       [1.2, 3.9, 4.2, 5.1],
...                                       [0.4, 3.7, 4.3, 5.3]]]))
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result)
{
    a: ivy.array([[[1.5678761, 0.7615059],
                   [2.18969631, 0.40326414],
                   [2.19991851, 0.40000817]]]),
    b: ivy.array([[[4.30141067, 5.19610119],
                   [4.32028484, 5.1708746],
                   [4.34100914, 5.14920235]]])
}

With ivy.Array inputs and ivy.Array mask:

>>> x = ivy.array([[[0.2, 1.],
...                 [2.2, 3.],
...                 [4.4, 5.6]]])
>>> context = ivy.array([[[0.2, 1., 1.1, 4.2],
...                       [2.2, 3., 0.9, 3.6],
...                       [4.4, 5.6, 2.2, 0.4]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.multi_head_attention(x, 1, 2, context=context, mask=mask)
>>> print(result)
ivy.array([[[1.40000009, 2.73333335],
...         [1.40000009, 2.73333335],
...         [1.40000009, 2.73333335]]])

With ivy.Array inputs and lambda to_q_fn and to_kv_fn functions specified:

>>> x = ivy.array([[[0.2, 1.],
...                 [2.2, 3.],
...                 [4.4, 5.6]]])
>>> context = ivy.array([[[0.2, 1., 1.1, 4.2],
...                       [2.2, 3., 0.9, 3.6],
...                       [4.4, 5.6, 2.2, 0.4]]])
>>> to_q_fn = lambda n, v: n
>>> to_kv_fn = lambda n, v: ivy.split(n, num_or_size_splits=2, axis=-1)
>>> result = ivy.multi_head_attention(x, 1, 2, context=context,
...                                   to_q_fn=to_q_fn, to_kv_fn=to_kv_fn)
>>> print(result)
ivy.array([[[1.5678761 , 0.65441847],
...         [2.18969631, 0.40131447],
...         [2.19991851, 0.40000153]]])
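Under the usual multi-head layout, the feature dimension is divided evenly across the attention heads, each head attends independently, and the per-head results are concatenated back together. As a rough illustration of that splitting and merging (a hedged NumPy sketch with hypothetical helper names, not ivy's implementation):

```python
import numpy as np

# Hypothetical helpers illustrating how multi-head attention typically
# splits the feature dimension across heads (not ivy's implementation).

def split_heads(x, num_heads):
    # [batch, seq, num_heads * head_dim] -> [batch * num_heads, seq, head_dim]
    b, s, d = x.shape
    head_dim = d // num_heads
    x = x.reshape(b, s, num_heads, head_dim)
    return x.transpose(0, 2, 1, 3).reshape(b * num_heads, s, head_dim)

def merge_heads(x, num_heads):
    # Inverse of split_heads:
    # [batch * num_heads, seq, head_dim] -> [batch, seq, num_heads * head_dim]
    bh, s, head_dim = x.shape
    b = bh // num_heads
    x = x.reshape(b, num_heads, s, head_dim)
    return x.transpose(0, 2, 1, 3).reshape(b, s, num_heads * head_dim)

x = np.arange(24, dtype=float).reshape(1, 3, 8)
heads = split_heads(x, 2)
print(heads.shape)  # (2, 3, 4)
```

The round trip `merge_heads(split_heads(x, n), n)` recovers the original array, which is why the head split can be implemented with plain reshapes and transposes rather than explicit slicing.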

ivy.scaled_dot_product_attention(q, k, v, scale, /, *, mask=None, out=None)[source]

Applies scaled dot-product attention to the query, key and value arrays, with an optional mask.

Parameters
  • q (Union[Array, NativeArray]) – The queries input array, of shape [batch_shape,num_queries,feat_dim]. The batch shape and feature dimension must match those of the keys and values.

  • k (Union[Array, NativeArray]) – The keys input array, of shape [batch_shape,num_keys,feat_dim]. The batch shape and feature dimension must match those of the queries and values.

  • v (Union[Array, NativeArray]) – The values input array, of shape [batch_shape,num_keys,feat_dim]. The batch shape and number of keys must match those of the keys array.

  • scale (float) – The value by which the query-key similarity scores are scaled before the softmax is applied.

  • mask (Optional[Union[Array, NativeArray]]) – The mask to apply to the query-key scores, of shape [batch_shape,num_queries,num_keys]. (default: None)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

  • ret – The output following application of scaled dot-product attention: the weighted sum of the values, produced from the attention scores. The shape of the output array is [batch_shape,num_queries,feat_dim].

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.
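Concretely, the operation computes softmax(q·kᵀ·scale)·v. The following standalone NumPy sketch (an illustration of the math, not ivy's implementation) reproduces the values printed in the examples below, including the uniform averaging over values produced by an all-zero mask, under the assumed convention that zero mask entries are suppressed:

```python
import numpy as np

# Illustrative NumPy version of scaled dot-product attention
# (a sketch of the math, not ivy's implementation).
def sdpa(q, k, v, scale, mask=None):
    # similarity scores: [batch, num_queries, num_keys]
    logits = np.einsum("bqf,bkf->bqk", q, k) * scale
    if mask is not None:
        # assumed convention: zero mask entries are masked out
        logits = np.where(mask == 0, -1e9, logits)
    # numerically stable softmax over the key axis
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # weighted sum of values: [batch, num_queries, feat_dim]
    return np.einsum("bqk,bkf->bqf", weights, v)

q = np.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
k = np.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
v = np.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
# rounds to [[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]]
print(np.round(sdpa(q, k, v, 1.), 2))
```

With an all-zero mask every score is suppressed equally, so the softmax becomes uniform and each query receives the plain mean of the values, which is why the masked examples below all print the same row.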

Functional Examples

With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04,5.03],[4.3,5.3],[4.3,5.3]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[2.3, 3.23],[2.3, 3.23],[2.3, 3.23]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With ivy.NativeArray input:

>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04,5.03],[4.3,5.3],[4.3,5.3]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.native_array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[2.3, 3.23],[2.3, 3.23],[2.3, 3.23]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With ivy.Container input:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                   [4.4, 5.6],
                   [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                   [4.4, 5.6],
                   [4.4, 5.6]]])
}
>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                   [4.4, 5.6],
                   [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                   [4.4, 5.6],
                   [4.4, 5.6]]])
}

With a mix of ivy.Array and ivy.NativeArray inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04, 5.03],
...         [4.3 , 5.3 ],
...         [4.3 , 5.3 ]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13],
                   [4.3, 5.3],
                   [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08],
                   [4.3, 5.3],
                   [4.3, 5.3]]])
}

Instance Method Examples

With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[2.3, 3.23],[2.3, 3.23],[2.3, 3.23]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask, out=out)
>>> print(out)
ivy.array([[[2.3, 3.23],[2.3, 3.23],[2.3, 3.23]]])

With ivy.Container input:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3],[4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.],[4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0],
...                                    [1.0, 1.0, 1.0]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                   [4.4, 5.6],
                   [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                   [4.4, 5.6],
                   [4.4, 5.6]]])
}

With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3],[4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6],[4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13],
                   [4.3, 5.3],
                   [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08],
                   [4.3, 5.3],
                   [4.3, 5.3]]])
}

This should have given you an overview of the layers submodule. If you have any questions, please feel free to reach out on our Discord in the layers channel, or in the layers forum!