Layers
Collection of Ivy neural network layers in functional form.
- ivy.conv1d(x, filters, strides, padding, /, *, data_format='NWC', dilations=1, out=None)[source]
Computes a 1-D convolution given 3-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fw, d_in, d_out].
  strides (int) – The stride of the sliding window for each dimension of input.
  padding (str) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  data_format (str) – "NWC" or "NCW". Defaults to "NWC". (default: 'NWC')
  dilations (int) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – The result of the convolution operation.
Examples
With ivy.Array input:

>>> x = ivy.asarray([[[0.], [3.], [0.]]]) #NWC
>>> filters = ivy.array([[[0.]], [[1.]], [[0.]]]) #WIO
>>> result = ivy.conv1d(x, filters, (1,), 'SAME', data_format='NWC', dilations=(1,))
>>> print(result)
ivy.array([[[0.], [3.], [0.]]])
With ivy.NativeArray input:

>>> x = ivy.native_array([[[1., 3.], [2., 4.], [5., 7.]]])
>>> filters = ivy.native_array([[[0., 1.], [1., 0.]]])
>>> result = ivy.conv1d(x, filters, (2,), 'VALID')
>>> print(result)
ivy.array([[[3., 1.], [7., 5.]]])
With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.array([[[1.2, 3.1, 4.8], [5.9, 2.2, 3.3],
...                                 [10.8, 7.6, 4.9], [6.1, 2.2, 9.5]]]),
...                   b=ivy.array([[[8.8, 7.7, 6.6], [1.1, 2.2, 3.5]]]))
>>> filters = ivy.array([[[1., 0., 1.], [0., 1., 0.], [1., 1., 0.]]])
>>> result = ivy.conv1d(x, filters, 3, 'VALID')
>>> print(result)
{
    a: ivy.array([[[6., 7.9, 1.2], [15.6, 11.7, 6.1]]]),
    b: ivy.array([[[15.4, 14.3, 8.8]]])
}
- ivy.conv1d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NWC', dilations=1, out=None)[source]
Computes a 1-D transpose convolution given 3-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fw, d_in, d_out].
  strides (int) – The stride of the sliding window for each dimension of input.
  padding (str) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)
  data_format (str) – "NWC" or "NCW". Defaults to "NWC". (default: 'NWC')
  dilations (int) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
  Union[Array, NativeArray]
- Returns
  ret – The result of the transpose convolution operation.
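Examples

With ivy.Array input (a minimal sketch mirroring the conv3d_transpose examples below; the printed shape assumes the usual 'SAME'-padding transpose rule, output width = input width × stride):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 14, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 6])
>>> y = ivy.conv1d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 28, 6)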
- ivy.conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]
Computes a 2-D convolution given 4-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, h, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fh, fw, d_in, d_out].
  strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
  padding (str) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  data_format (str) – "NHWC" or "NCHW". Defaults to "NHWC". (default: 'NHWC')
  dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – The result of the convolution operation.
Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.
Functional Examples
With ivy.Array input:

>>> x = ivy.array([[[[1.], [2.0], [3.]],
...                 [[1.], [2.0], [3.]],
...                 [[1.], [2.0], [3.]]]]) #NHWC
>>> filters = ivy.array([[[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]]]) #HWIO
>>> result = ivy.conv2d(x, filters, (1,), 'SAME', data_format='NHWC', dilations=(1,))
>>> print(result)
ivy.array([[[[2.], [4.], [6.]],
            [[3.], [6.], [9.]],
            [[2.], [4.], [6.]]]])
With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 32, 32, 3]))
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 5, 3, 5])) #HWIO
>>> # non-square filter with unequal stride and valid padding
>>> result = ivy.conv2d(x, filters, [2, 1], 'VALID')
>>> print(result.shape)
(1, 15, 28, 5)
With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.Container(a=ivy.eye(3, 3).reshape((1, 3, 3, 1)),
...                   b=ivy.eye(5, 5).reshape((1, 5, 5, 1)))
>>> filters = ivy.array([[2., 0., 1.],
...                      [1., 3., 1.],
...                      [0., 1., 1.]]).expand_dims(-1).expand_dims(-1)
>>> result = ivy.conv2d(x, filters, (2,), 'SAME')
>>> print(result)
{
    a: ivy.array([[[[4.], [0.]], [[1.], [5.]]]]),
    b: ivy.array([[[[4.], [0.], [0.]], [[1.], [6.], [0.]], [[0.], [1.], [5.]]]])
}
- ivy.conv2d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NHWC', dilations=1, out=None)[source]
Computes a 2-D transpose convolution given 4-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, h, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fh, fw, d_in, d_out].
  strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
  padding (str) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)
  data_format (str) – "NHWC" or "NCHW". Defaults to "NHWC". (default: 'NHWC')
  dilations (Union[int, Tuple[int], Tuple[int, int]]) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
  Union[Array, NativeArray]
- Returns
  ret – The result of the transpose convolution operation.
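Examples

With ivy.Array input (a minimal sketch mirroring the conv3d_transpose examples below; the printed shape assumes the usual 'SAME'-padding transpose rule, output height/width = input height/width × stride):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 6])
>>> y = ivy.conv2d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 56, 56, 6)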
- ivy.conv3d(x, filters, strides, padding, /, *, data_format='NDHWC', dilations=1, out=None)[source]
Computes a 3-D convolution given 5-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input volume [batch_size, d, h, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fd, fh, fw, d_in, d_out].
  strides (int) – The stride of the sliding window for each dimension of input.
  padding (str) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  data_format (str) – "NDHWC" or "NCDHW". Defaults to "NDHWC". (default: 'NDHWC')
  dilations (int) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – The result of the convolution operation.
Examples
>>> x1 = [[[1.], [2.]], [[1.], [2.]], [[1.], [2.]]]
>>> x2 = [[[3.], [4.]], [[3.], [4.]], [[3.], [4.]]]
>>> x = ivy.array([[x1, x2]]) #NDHWC
>>> filters = ivy.array([[[[[1.]], [[0.]]]]]) #DHWIO
>>> result = ivy.conv3d(x, filters, 1, 'VALID', data_format='NDHWC', dilations=1)
>>> print(result)
ivy.array([[[[[1.]], [[1.]], [[1.]]],
            [[[3.]], [[3.]], [[3.]]]]])
- ivy.conv3d_transpose(x, filters, strides, padding, /, *, output_shape=None, data_format='NDHWC', dilations=1, out=None)[source]
Computes a 3-D transpose convolution given 5-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, d, h, w, d_in].
  filters (Union[Array, NativeArray]) – Convolution filters [fd, fh, fw, d_in, d_out].
  strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.
  padding (Union[str, List[int]]) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  output_shape (Optional[Union[Shape, NativeShape]]) – Shape of the output. (default: None)
  data_format (str) – "NDHWC" or "NCDHW". Defaults to "NDHWC". (default: 'NDHWC')
  dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – The result of the transpose convolution operation.
Functional Examples
With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)
With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 7, 256, 256, 64]))
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 64, 32]))
>>> y = ivy.conv3d_transpose(x, filters, [1, 1, 1], 'VALID')
>>> print(y.shape)
(1, 9, 258, 258, 32)
With ivy.Container inputs:

>>> x = ivy.Container(a=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6]),
...                         d=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6]))
>>> y = ivy.conv3d_transpose(x, filters, 2, 'SAME')
>>> print(y.shape)
[1, 6, 56, 56, 6]
With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 2.7)
>>> a = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> b = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1])
>>> filters = ivy.Container(a=a, b=b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]
With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.full((1, 6, 6, 6, 1), 1.23)
>>> a = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> b = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 1, 1]))
>>> filters = ivy.Container(a=a, b=b)
>>> y = ivy.conv3d_transpose(x, filters, 1, 'VALID', dilations=1)
>>> print(y.shape)
[1, 8, 8, 8, 1]
Instance Method Examples
Using ivy.Array instance method:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 6])
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)
Using ivy.Container instance method:

>>> x = ivy.Container(a=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]),
...                   b=ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3]))
>>> filters = ivy.Container(c=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 3]),
...                         d=ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 3, 3]))
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 3)
- ivy.depthwise_conv2d(x, filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)[source]
Computes a 2-D depthwise convolution given 4-D input x and filters arrays.
- Parameters
  x (Union[Array, NativeArray]) – Input image [batch_size, h, w, d].
  filters (Union[Array, NativeArray]) – Convolution filters [fh, fw, d_in]. (d_in must be the same as d from x.)
  strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
  padding (Union[str, List[int]]) – "SAME" or "VALID", indicating the algorithm, or a list indicating the per-dimension paddings.
  data_format (str) – "NHWC" or "NCHW". Defaults to "NHWC". (default: 'NHWC')
  dilations (Optional[Union[int, Tuple[int], Tuple[int, int]]]) – The dilation factor for each dimension of input. (default: 1)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – The result of the convolution operation.
Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.
Examples
With ivy.Array input:

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3]) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, (1, 1), 'VALID')
>>> print(y.shape)
(1, 26, 26, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 32, 32, 3]) #NHWC
>>> y = ivy.zeros_like(x)
>>> filters = ivy.random_normal(mean=0, std=1, shape=[5, 5, 3]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [2, 2], 'SAME', out=y)
>>> print(y.shape)
(1, 16, 16, 3)
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 64, 64, 32]) #NHWC
>>> filters = ivy.random_normal(mean=0, std=1, shape=[4, 4, 32]) #HWI (I == d_in)
>>> ivy.depthwise_conv2d(x, filters, [1, 1], 'VALID', out=x)
>>> print(x.shape)
(1, 61, 61, 32)
With ivy.NativeArray input:

>>> x = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[1, 7, 7, 64])) #NHWC
>>> filters = ivy.native_array(ivy.random_normal(mean=0, std=1, shape=[3, 3, 64])) #HWI (I == d_in)
>>> y = ivy.depthwise_conv2d(x, filters, [1, 1], 'SAME')
>>> print(y.shape)
(1, 7, 7, 64)
With a mix of ivy.Array and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> a = ivy.array([[1., 1., 1.], [1., -8., 1.], [1., 1., 1.]]).expand_dims(-1)
>>> b = ivy.array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.]]).expand_dims(-1) / 9.0
>>> filters = ivy.Container(a=a, b=b)
>>> y = ivy.depthwise_conv2d(x, filters, 1, 'VALID', dilations=2)
>>> print(y)
{
    a: ivy.array([[[[-6.], [0.]], [[0.], [-6.]]]]),
    b: ivy.array([[[[0.333], [0.]], [[0.], [0.333]]]])
}
With a mix of ivy.Array, ivy.NativeArray and ivy.Container inputs:

>>> x = ivy.eye(6, 6).reshape((1, 6, 6, 1)) #NHWC
>>> y = ivy.native_array(ivy.eye(6, 6).reshape((1, 6, 6, 1)))
>>> inp = ivy.Container(x=x, y=y)
>>> filters = ivy.array([[1., 1., 1.], [1., -8., 1.], [1., 1., 1.]]).expand_dims(-1)
>>> y = ivy.depthwise_conv2d(inp, filters, 1, 'VALID', dilations=2)
>>> print(y)
{
    x: ivy.array([[[[-6.], [0.]], [[0.], [-6.]]]]),
    y: ivy.array([[[[-6.], [0.]], [[0.], [-6.]]]])
}
- ivy.dropout(x, prob, /, *, scale=True, dtype=None, out=None)[source]
Randomly zeroes some elements of the input array x with probability prob, using samples from a Bernoulli distribution.
- Parameters
  x (Union[Array, NativeArray]) – The input array x to perform dropout on.
  prob (float) – The probability of zeroing out each array element.
  scale (bool) – Whether to scale the output by 1/(1 - prob). Default is True. (default: True)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
  ret – Result array after the dropout operation has been applied.
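Examples

With ivy.Array input (a minimal usage sketch; the surviving elements are chosen at random, so only the output shape is shown):

>>> x = ivy.array([[1., 2., 3.], [4., 5., 6.]])
>>> y = ivy.dropout(x, 0.5)
>>> print(y.shape)
(2, 3)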
- ivy.linear(x, weight, /, *, bias=None, out=None)[source]
Applies a linear transformation to the incoming data: y = x * t(weight) + bias. The operation also supports batching of the weight matrices. This is useful when a batch of different network parameters is to be represented.
- Parameters
  x (Union[Array, NativeArray]) – The input x to compute the linear transformation on. [outer_batch_shape, inner_batch_shape, in_features]
  weight (Union[Array, NativeArray]) – The weight matrix. [outer_batch_shape, out_features, in_features]
  bias (Optional[Union[Array, NativeArray]]) – The bias vector. Default is None. [outer_batch_shape, out_features] (default: None)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
- Returns
ret – Result array of the linear transformation. [outer_batch_shape,inner_batch_shape,out_features]
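Examples

With ivy.Array input (a minimal sketch; the expected output is simply x * t(weight) + bias, computed by hand):

>>> x = ivy.array([[1., 2., 3.]])
>>> weight = ivy.array([[1., 0., 0.], [0., 1., 0.]])
>>> bias = ivy.array([2., 2.])
>>> y = ivy.linear(x, weight, bias=bias)
>>> print(y)
ivy.array([[3., 4.]])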
- ivy.lstm_update(x, init_h, init_c, kernel, recurrent_kernel, /, *, bias=None, recurrent_bias=None)[source]
Performs a long short-term memory (LSTM) update by unrolling the time dimension of the input array.
- Parameters
  x (Union[Array, NativeArray]) – input tensor of LSTM layer [batch_shape, t, in].
  init_h (Union[Array, NativeArray]) – initial state tensor for the cell output [batch_shape, out].
  init_c (Union[Array, NativeArray]) – initial state tensor for the cell hidden state [batch_shape, out].
  kernel (Union[Array, NativeArray]) – weights for cell kernel [in, 4 x out].
  recurrent_kernel (Union[Array, NativeArray]) – weights for cell recurrent kernel [out, 4 x out].
  bias (Optional[Union[Array, NativeArray]]) – bias for cell kernel [4 x out]. (default: None)
  recurrent_bias (Optional[Union[Array, NativeArray]]) – bias for cell recurrent kernel [4 x out]. (default: None)
- Return type
  Tuple[Any, Union[Array, NativeArray, Any]]
- Returns
  ret – hidden state for all timesteps [batch_shape, t, out] and cell state for the last timestep [batch_shape, out]
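Examples

With ivy.Array input (a minimal sketch; the weights are random, so only the output shapes are shown, which follow directly from the shape contract above):

>>> batch, t, in_feats, out_feats = 2, 5, 3, 4
>>> x = ivy.random_normal(mean=0, std=1, shape=[batch, t, in_feats])
>>> init_h = ivy.zeros(shape=(batch, out_feats))
>>> init_c = ivy.zeros(shape=(batch, out_feats))
>>> kernel = ivy.random_normal(mean=0, std=1, shape=[in_feats, 4 * out_feats])
>>> recurrent_kernel = ivy.random_normal(mean=0, std=1, shape=[out_feats, 4 * out_feats])
>>> h, c = ivy.lstm_update(x, init_h, init_c, kernel, recurrent_kernel)
>>> print(h.shape, c.shape)
(2, 5, 4) (2, 4)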
- ivy.multi_head_attention(x, scale, num_heads, /, *, context=None, mask=None, to_q_fn=None, to_kv_fn=None, to_out_fn=None, to_q_v=None, to_kv_v=None, to_out_v=None, out=None)[source]
Applies multi-head attention to inputs x.
- Parameters
  x (Union[Array, NativeArray]) – The array to determine the queries from [batch_shape, num_queries, x_feat_dim].
  scale – The value by which to scale the query-key similarity measure before softmax.
  num_heads – The number of attention heads to use.
  context (Optional[Union[Array, NativeArray]]) – The array to determine the keys and values from [batch_shape, num_keys, cont_feat_dim]. Default is None. (default: None)
  mask (Optional[Union[Array, NativeArray]]) – The mask to apply to the query-key values [batch_shape, num_queries, num_keys]. Default is None. (default: None)
  to_q_fn (Optional[Callable]) – The function to compute queries from input x, returning queries [batch_shape, num_queries, num_heads × feat_dim]. (default: None)
  to_kv_fn (Optional[Callable]) – The function to compute keys and values from the context. (default: None)
  to_out_fn (Optional[Callable]) – The function to compute the output from the scaled dot-product attention. (default: None)
  to_q_v – The variables for the to_q_fn function. Default is None.
  to_kv_v – The variables for the to_kv_fn function. Default is None.
  to_out_v – The variables for the to_out_fn function. Default is None.
  out (Optional[Union[Array, NativeArray]]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
  Union[Array, NativeArray]
- Returns
  ret – The output following application of multi-head attention. [batch_shape, num_queries, out_feat_dim]
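Examples

With ivy.Array input (a minimal sketch, assuming that with to_kv_fn left as None the context is split in half along its last dimension to provide the keys and values, so cont_feat_dim is twice x_feat_dim; only the output shape is shown):

>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 2])
>>> context = ivy.random_normal(mean=0, std=1, shape=[1, 3, 4])
>>> result = ivy.multi_head_attention(x, 1, 2, context=context)
>>> print(result.shape)
(1, 3, 2)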
- ivy.scaled_dot_product_attention(q, k, v, scale, /, *, mask=None, out=None)[source]
Applies scaled dot-product attention to the query, key and value input arrays, using an optional mask.
- Parameters
  q (Union[Array, NativeArray]) – The queries input array, of shape [batch_shape, num_queries, feat_dim]. The queries should have the same size as the keys and values.
  k (Union[Array, NativeArray]) – The keys input array, of shape [batch_shape, num_keys, feat_dim]. The keys should have the same size as the queries and values.
  v (Union[Array, NativeArray]) – The values input array, of shape [batch_shape, num_keys, feat_dim]. The values should have the same size as the queries and keys.
  scale (float) – The scale float value, used to scale the query-key pairs before softmax.
  mask (Optional[Union[Array, NativeArray]]) – The mask input array, applied to the query-key values, of shape [batch_shape, num_queries, num_keys]. Default is None. (default: None)
  out (Optional[Array]) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to. (default: None)
- Return type
  Union[Array, NativeArray]
- Returns
  ret – The output following application of scaled dot-product attention: the weighted sum produced by the attention scores and the values, of shape [batch_shape, num_queries, feat_dim].
Both the description and the type hints above assume an array input for simplicity, but this function is nestable, and therefore also accepts ivy.Container instances in place of any of the arguments.
Functional Examples
With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan, nan], [nan, nan], [nan, nan]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
With ivy.NativeArray input:

>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.native_array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan, nan], [nan, nan], [nan, nan]]])
>>> q = ivy.native_array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
With ivy.Container input:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4], [4.4, 5.6], [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54], [4.4, 5.6], [4.4, 5.6]]])
}
>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]))
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4], [4.4, 5.6], [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54], [4.4, 5.6], [4.4, 5.6]]])
}
With a mix of ivy.Array and ivy.NativeArray inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.native_array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.native_array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, 1, out=out)
>>> print(out)
ivy.array([[[4.04, 5.03], [4.3, 5.3], [4.3, 5.3]]])
With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13], [4.3, 5.3], [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08], [4.3, 5.3], [4.3, 5.3]]])
}
Instance Method Examples
Using ivy.Array instance method:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> result = q.scaled_dot_product_attention(k, v, 1, mask=mask)
>>> print(result)
ivy.array([[[nan, nan], [nan, nan], [nan, nan]]])
>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> q.scaled_dot_product_attention(k, v, 1, mask=mask, out=out)
>>> print(out)
ivy.array([[[nan, nan], [nan, nan], [nan, nan]]])
Using ivy.Container instance method:

>>> q = ivy.Container(a=ivy.array([[[0.2, 1.], [2.7, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[1.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.Container(a=ivy.array([[[5.2, 1.], [2.1, 3.], [4.4, 5.6]]]),
...                   b=ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]]))
>>> mask = ivy.Container(a=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]),
...                      b=ivy.array([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]))
>>> result = q.scaled_dot_product_attention(k, v, 1, mask=mask)
>>> print(result)
{
    a: ivy.array([[[4.27, 5.4], [4.4, 5.6], [4.4, 5.6]]]),
    b: ivy.array([[[4.35, 5.54], [4.4, 5.6], [4.4, 5.6]]])
}
With a mix of ivy.Array and ivy.Container inputs:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.Container(a=ivy.array([[[4.2, 1.], [2.2, 3.3], [4.4, 5.6]]]),
...                   b=ivy.array([[[3.2, 1.], [2.2, 3.6], [4.0, 5.6]]]))
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> result = q.scaled_dot_product_attention(k, v, 1)
>>> print(result)
{
    a: ivy.array([[[4.14, 5.13], [4.3, 5.3], [4.3, 5.3]]]),
    b: ivy.array([[[4.09, 5.08], [4.3, 5.3], [4.3, 5.3]]])
}