Layers

Collection of Ivy neural network layers in functional form.

ivy.conv1d(x, filters, strides, padding, /, *, data_format='NWC', dilations=1, out=None)[source]

Computes a 1-D convolution given 3-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NWC” or “NCW”. Defaults to “NWC”. (default: 'NWC')

  • dilations (int) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Examples

With ivy.Array input:

>>> x = ivy.asarray([[[0.], [3.], [0.]]]) #NWC
>>> filters = ivy.array([[[0.]], [[1.]], [[0.]]]) #WIO
>>> result = ivy.conv1d(x, filters, (1,), 'SAME', data_format='NWC', dilations=(1,))
>>> print(result)
ivy.array([[[0.], [3.], [0.]]])

With ivy.NativeArray input:

>>> x = ivy.native_array([[[1., 3.], [2., 4.], [5., 7.]]])
>>> filters = ivy.native_array([[[0., 1.], [1., 0.]]])
>>> result = ivy.conv1d(x, filters, (2,), 'VALID')
>>> print(result)
ivy.array([[[3., 1.],
            [7., 5.]]])

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([[[1.2, 3.1, 4.8], [5.9, 2.2, 3.3],
...                                 [10.8, 7.6, 4.9], [6.1, 2.2, 9.5]]]),
...                   b=ivy.array([[[8.8, 7.7, 6.6], [1.1, 2.2, 3.5]]]))
>>> filters = ivy.array([[[1., 0., 1.], [0., 1., 0.], [1., 1., 0.]]])
>>> result = ivy.conv1d(x, filters, 3, 'VALID')
>>> print(result)
{
    a: ivy.array([[[6., 7.9, 1.2],
                   [15.6, 11.7, 6.1]]]),
    b: ivy.array([[[15.4, 14.3, 8.8]]])
}
ivy.conv1d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NWC', dilations=1, out=None)[source]

Computes a 1-D transpose convolution given 3-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output (Default value = None) (default: None)

  • data_format (str) – “NWC” or “NCW”. Defaults to “NWC”. (default: 'NWC')

  • dilations (int) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Union[Array, NativeArray]

Returns

ret – The result of the transpose convolution operation.
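
A hedged usage sketch, by analogy with the conv3d_transpose examples further below (the inputs are random, so only the output shape is checked):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 3]) #NWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 6]) #[fw, d_in, d_out]
>>> y = ivy.conv1d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 56, 6)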

ivy.conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Functional Examples

With ivy.Array input:

>>> x = ivy.array([[[[1.], [2.0], [3.]],
...                 [[1.], [2.0], [3.]],
...                 [[1.], [2.0], [3.]]]]) #NHWC
>>> filters = ivy.array([[[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]]]) #HWIO
>>> result = ivy.conv2d(x, filters, (1,), 'SAME', data_format='NHWC', dilations=(1,))
>>> print(result)
ivy.array([[
          [[2.],[4.],[6.]],
          [[3.],[6.],[9.]],
          [[2.],[4.],[6.]]
          ]])

With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 32, 32, 3]))
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 5, 3, 5])) #HWIO
>>> result = ivy.conv2d(x, filters, [2, 1], 'VALID') # non-square filter with unequal stride and valid padding
>>> print(result.shape)
(1, 15, 28, 5)

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.eye(3, 3).reshape((1, 3, 3, 1)),
...                   b=ivy.eye(5, 5).reshape((1, 5, 5, 1)))
>>> filters = ivy.array([[2., 0., 1.],
...                      [1., 3., 1.],
...                      [0., 1., 1.]]).expand_dims(-1).expand_dims(-1)
>>> result = ivy.conv2d(x, filters, (2,), 'SAME')
>>> print(result)
{
    a: ivy.array([[[[4.], [0.]], [[1.], [5.]]]]),
    b: ivy.array([[[[4.], [0.], [0.]], [[1.], [6.], [0.]], [[0.], [1.], [5.]]]])
}
ivy.conv2d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D transpose convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output (Default value = None) (default: None)

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Union[int, Tuple[int], Tuple[int, int]]) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Union[Array, NativeArray]

Returns

ret – The result of the transpose convolution operation.
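
A hedged usage sketch, again by analogy with the conv3d_transpose examples further below (random inputs, so only the output shape is checked):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 6]) #[fh, fw, d_in, d_out]
>>> y = ivy.conv2d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 56, 56, 6)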

ivy.conv3d(x, filters, strides, padding, /, *, data_format='NDHWC', dilations=1, out=None)[source]

Computes a 3-D convolution given 5-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input volume [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (int) – The stride of the sliding window for each dimension of input.

  • padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NDHWC” or “NCDHW”. Defaults to “NDHWC”. (default: 'NDHWC')

  • dilations (int) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Examples

>>> x1 = [[[1.],[2.]],[[1.],[2.]],[[1.],[2.]]]
>>> x2 = [[[3.],[4.]],[[3.],[4.]],[[3.],[4.]]]
>>> x = ivy.array([[x1,x2]]) #NDHWC
>>> filters = ivy.array([[[[[1]],[[0.]]]]]) #DHWIO
>>> result = ivy.conv3d(x, filters, 1, 'VALID', data_format="NDHWC", dilations=1)
>>> print(result)
ivy.array([[
    [
        [[1.]],[[1.]],[[1.]]
    ],
    [
        [[3.]],[[3.]],[[3.]]
    ]
        ]])
ivy.conv3d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NDHWC', dilations=1, out=None)[source]

Computes a 3-D transpose convolution given 5-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input volume [batch_size,d,h,w,d_in].

  • filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].

  • strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output (Default value = None) (default: None)

  • data_format (str) – “NDHWC” or “NCDHW”. Defaults to “NDHWC”. (default: 'NDHWC')

  • dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the transpose convolution operation.

Functional Examples

With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)

With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 7, 256, 256, 64]))
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 64, 32]))
>>> y = ivy.conv3d_transpose(x, filters, [1, 1, 1], 'VALID')
>>> print(y.shape)
(1, 9, 258, 258, 32)

With ivy.Container inputs:

>>> x = ivy.Container(a=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6]),
...                         d=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6]))
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
[1, 6, 56, 56, 6]

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 2.7)
>>> a =  ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> b =  ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]

With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 1.23)
>>> a =  ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> b =  ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]

Instance Method Examples

Using ivy.Array instance method:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)

Using ivy.Container instance method:

>>> x = ivy.Container(a=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 3]),
...                         d=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 3]))
>>> y = x.conv3d_transpose(filters, 2, "SAME")
>>> print(y.shape)
(1, 6, 56, 56, 3)
ivy.deconv_length(dim_size, stride_size, kernel_size, padding, dilation=1)[source]
ivy.depthwise_conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]

Computes a 2-D depthwise convolution given 4-D input x and filters arrays.

Parameters
  • x (Union[Array, NativeArray]) – Input image [batch_size,h,w,d].

  • filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in]. (d_in must be the same as d from x)

  • strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.

  • padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.

  • data_format (str) – “NHWC” or “NCHW”. Defaults to “NHWC”. (default: 'NHWC')

  • dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (Default value = 1) (default: 1)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – The result of the convolution operation.

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.

Examples

With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3]) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, (1, 1), 'VALID')
>>> print(y.shape)
(1, 26, 26, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 32, 32, 3]) #NHWC
>>> y = ivy.zeros_like(x)
>>> filters = ivy.random_normal(mean=0, std=1, shape=[5, 5, 3]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [2, 2], 'SAME', out=y)
>>> print(y.shape)
(1, 16, 16, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 64, 64, 32]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[4, 4, 32]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [1, 1], 'VALID', out=x)
>>> print(x.shape)
(1, 61, 61, 32)

With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 7, 7, 64])) #NHWC
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 64])) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, [1, 1], 'SAME')
>>> print(y.shape)
(1, 7, 7, 64)

With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> a = ivy.array([[1., 1., 1.], [1., -8., 1.], [1., 1., 1.]]).expand_dims(-1)
>>> b = ivy.array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.]]).expand_dims(-1) / 9.0
>>> filters = ivy.Container(a = a, b = b)
>>> y = ivy.depthwise_conv2d(x, filters, 1, 'VALID', dilations=2)
>>> print(y)
{
    a: ivy.array([[[[-6.],
                    [0.]],
                   [[0.],
                    [-6.]]]]),
    b: ivy.array([[[[0.333],
                    [0.]],
                   [[0.],
                    [0.333]]]])
}

With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> y = ivy.native_array(ivy.eye(6, 6).reshape((1, 6, 6, 1)))
>>> inp = ivy.Container(x = x, y = y)
>>> filter = ivy.array([[1., 1., 1.], [1., -8., 1.], [1., 1., 1.]]).expand_dims(-1)
>>> y = ivy.depthwise_conv2d(inp, filter, 1, 'VALID', dilations=2)
>>> print(y)
{
    x: ivy.array([[[[-6.],
                    [0.]],
                   [[0.],
                    [-6.]]]]),
    y: ivy.array([[[[-6.],
                    [0.]],
                   [[0.],
                    [-6.]]]])
}
ivy.dropout(x, prob, /, *, scale=True, dtype=None, out=None)[source]

Randomly zeroes some elements of the input tensor with probability prob using samples from a Bernoulli distribution.

Parameters
  • x (Union[Array, NativeArray]) – The input array x to perform dropout on.

  • prob (float) – The probability of zeroing out each array element.

  • scale (bool) – Whether to scale the output by 1/(1-prob), default is True. (default: True)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – Result array after the dropout operation, with the same shape as x.
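
A hedged sketch of typical usage (which elements are zeroed is random, so only the shape is shown):

>>> x = ivy.random_normal(mean=0, std=1, shape=[4, 5])
>>> y = ivy.dropout(x, 0.3)
>>> print(y.shape)
(4, 5)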

ivy.handle_padding(x, strides, filters, padding)[source]
ivy.linear(x, weight, /, *, bias=None, out=None)[source]

Applies a linear transformation to the incoming data: y = x * t(weight) + bias. The operation also supports batching of the weight matrices. This is useful if a batch of different network parameters is to be represented.

Parameters
  • x (Union[Array, NativeArray]) – The input x to compute the linear transformation on. [outer_batch_shape,inner_batch_shape,in_features]

  • weight (Union[Array, NativeArray]) – The weight matrix. [outer_batch_shape,out_features,in_features]

  • bias (Optional[Union[Array, NativeArray]]) – The bias vector, default is None. [outer_batch_shape,out_features] (default: None)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Array

Returns

ret – Result array of the linear transformation. [outer_batch_shape,inner_batch_shape,out_features]
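
A hedged worked example of the formula y = x * t(weight) + bias above (the exact printed formatting may differ across backends):

>>> x = ivy.array([[1., 2., 3.]])       # [batch, in_features]
>>> weight = ivy.array([[1., 1., 1.],
...                     [0., 1., 2.]])  # [out_features, in_features]
>>> bias = ivy.array([1., 0.])
>>> y = ivy.linear(x, weight, bias=bias)
>>> print(y)
ivy.array([[7., 8.]])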

ivy.lstm_update(x, init_h, init_c, kernel, recurrent_kernel, /, *, bias=None, recurrent_bias=None)[source]

Perform a long short-term memory (LSTM) update by unrolling the time dimension of the input array.

Parameters
  • x (Union[Array, NativeArray]) – input tensor of LSTM layer [batch_shape, t, in].

  • init_h (Union[Array, NativeArray]) – initial state tensor for the cell output [batch_shape, out].

  • init_c (Union[Array, NativeArray]) – initial state tensor for the cell hidden state [batch_shape, out].

  • kernel (Union[Array, NativeArray]) – weights for cell kernel [in, 4 x out].

  • recurrent_kernel (Union[Array, NativeArray]) – weights for cell recurrent kernel [out, 4 x out].

  • bias (Optional[Union[Array, NativeArray]]) – bias for cell kernel [4 x out]. (Default value = None) (default: None)

  • recurrent_bias (Optional[Union[Array, NativeArray]]) – bias for cell recurrent kernel [4 x out]. (Default value = None) (default: None)

Return type

Tuple[Any, Union[Array, NativeArray, Any]]

Returns

ret – hidden state for all timesteps [batch_shape,t,out] and cell state for last timestep [batch_shape,out]
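
A hedged sketch with hypothetical sizes, checking only the returned shapes (the weights are random):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 5, 4])   # [batch, t, in]
>>> init_h = ivy.zeros([1, 8])                              # [batch, out]
>>> init_c = ivy.zeros([1, 8])
>>> kernel = ivy.random_normal(mean=0, std=1, shape=[4, 32])            # [in, 4 x out]
>>> recurrent_kernel = ivy.random_normal(mean=0, std=1, shape=[8, 32])  # [out, 4 x out]
>>> h, c = ivy.lstm_update(x, init_h, init_c, kernel, recurrent_kernel)
>>> print(h.shape, c.shape)
(1, 5, 8) (1, 8)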

ivy.multi_head_attention(x, scale, num_heads, /, *, context=None, mask=None, to_q_fn=None, to_kv_fn=None, to_out_fn=None, to_q_v=None, to_kv_v=None, to_out_v=None, out=None)[source]

Applies multi-head attention to inputs x.

Parameters
  • x (Union[Array, NativeArray]) – The array to determine the queries from [batch_shape,num_queries,x_feat_dim].

  • scale – The value by which to scale the query-key similarity measure before softmax.

  • num_heads – The number of attention heads to use.

  • context (Optional[Union[Array, NativeArray]]) – The array to determine the keys and values from, of shape [batch_shape,num_keys,cont_feat_dim]. Default is None. (default: None)

  • mask (Optional[Union[Array, NativeArray]]) – The mask to apply to the query-key values, of shape [batch_shape,num_queries,num_keys]. Default is None. (default: None)

  • to_q_fn (Optional[Callable]) – The function to compute queries from input x, returning queries of shape [batch_shape,num_queries,num_heads×feat_dim]. (default: None)

  • to_kv_fn (Optional[Callable]) – The function to compute keys and values from the context. (Default value = None) (default: None)

  • to_out_fn (Optional[Callable]) – The function to compute the output from the scaled dot-product attention. (default: None) (Default value = None)

  • to_q_v – The variables for function to_q_fn. Default is None.

  • to_kv_v – The variables for function to_kv_fn. Default is None.

  • to_out_v – The variables for function to_out_fn. Default is None.

  • out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Union[Array, NativeArray]

Returns

ret – The output following application of multi-head attention. [batch_shape,num_queries,out_feat_dim]
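
A hedged sketch, assuming that when to_kv_fn is not given the context is split in half along its last dimension to form the keys and values (so cont_feat_dim is taken to be twice x_feat_dim here); only the output shape is checked:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 4, 8])         # queries source
>>> context = ivy.random_normal(mean=0, std=1, shape=[1, 6, 16])  # split into keys and values
>>> result = ivy.multi_head_attention(x, 1.0, 2, context=context)
>>> print(result.shape)
(1, 4, 8)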

ivy.scaled_dot_product_attention(q, k, v, scale, /, *, mask=None, out=None)[source]

Applies scaled dot-product attention to the input queries, keys and values, using an optional mask.

Parameters
  • q (Union[Array, NativeArray]) – The queries input array. The shape of the queries input array should be [batch_shape,num_queries,feat_dim]. The queries input array should have the same size as keys and values.

  • k (Union[Array, NativeArray]) – The keys input array. The shape of the keys input array should be [batch_shape,num_keys,feat_dim]. The keys input array should have the same size as queries and values.

  • v (Union[Array, NativeArray]) – The values input array. The shape of the values input array should be [batch_shape,num_keys,feat_dim]. The values input array should have the same size as queries and keys.

  • scale (float) – The scale float value, used to scale the query-key similarities before the softmax.

  • mask (Optional[Union[Array, NativeArray]]) – The mask input array, applied to the query-key values. Its shape should be [batch_shape,num_queries,num_keys]. Default is None. (default: None)

  • out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)

Return type

Union[Array, NativeArray]

Returns

ret – The output following application of scaled dot-product attention. The output array is the weighted sum produced by the attention score and value. The shape of the output array is [batch_shape,num_queries,feat_dim].

Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.
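
The computation can be pictured with the following hedged sketch (an illustrative equivalent, not necessarily the backend implementation); it reproduces the shape of the first functional example below:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> sim = ivy.einsum('...qd,...kd->...qk', q, k) * 1.      # scaled query-key similarities
>>> weights = ivy.softmax(sim, axis=-1)                    # attention weights
>>> result = ivy.einsum('...qk,...kd->...qd', weights, v)  # weighted sum of the values
>>> print(result.shape)
(1, 3, 2)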

Functional Examples

With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04,5.03],[4.3,5.3],[4.3,5.3]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan,nan],[nan,nan],[nan,nan]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With ivy.NativeArray input:

>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04,5.03],[4.3,5.3],[4.3,5.3]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.native_array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan,nan],[nan,nan],[nan,nan]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With ivy.Container input:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                [4.4, 5.6],
                [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                [4.4, 5.6],
                [4.4, 5.6]]])
}
>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                [4.4, 5.6],
                [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                [4.4, 5.6],
                [4.4, 5.6]]])
}

With a mix of ivy.Array and ivy.NativeArray inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[
        [4.04, 5.03],
        [4.3 , 5.3 ],
        [4.3 , 5.3 ]
    ]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03],[4.3 , 5.3 ],[4.3 , 5.3 ]]])

With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13],
                [4.3, 5.3],
                [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08],
                [4.3, 5.3],
                [4.3, 5.3]]])
}

Instance Method Examples

With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan,nan],[nan,nan],[nan,nan]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask, out=out)
>>> print(out)
ivy.array([[[nan, nan],[nan, nan],[nan, nan]]])

With ivy.Container input:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4],
                [4.4, 5.6],
                [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54],
                [4.4, 5.6],
                [4.4, 5.6]]])
}

With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.native_array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13],
                [4.3, 5.3],
                [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08],
                [4.3, 5.3],
                [4.3, 5.3]]])
}