deepmd.dpmodel.descriptor.dpa4_nn.so3

SO(3)-equivariant linear layers for DPA4/SeZM.

This module defines the channel-only and focus-aware linear maps used by SeZM SO(3) feature transformations.

This module is the dpmodel (array-API) port of deepmd.pt.model.descriptor.sezm_nn.so3.

Classes#

`FocusLinear`	Per-focus linear projection on the last feature axis.
`ChannelLinear`	Channel-only linear projection on the last feature axis.
`SO3Linear`	Focus-aware degree-wise linear self-interaction.

Module Contents#

class deepmd.dpmodel.descriptor.dpa4_nn.so3.FocusLinear(*, in_channels: int, out_channels: int, n_focus: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)[source]#

Bases: deepmd.dpmodel.NativeOP

Per-focus linear projection on the last feature axis.

Parameters:

in_channels: Input feature dimension.
out_channels: Output feature dimension.
n_focus: Number of focus streams.
precision: Parameter precision.
bias: Whether to use bias.
trainable: Whether parameters are trainable.
seed: Random seed for initialization.
init_std: If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.
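As a rough illustration of why a small init_std suits gate projections, the NumPy sketch below draws a weight from normal(0, init_std) and feeds the resulting logits through a sigmoid. The sigmoid gate, the tensor sizes, and the helper name init_gate_weight are assumptions made for this example; they are not taken from the module.

```python
import numpy as np

def init_gate_weight(in_channels, out_channels, init_std, seed=0):
    """Hypothetical helper: draw an (in, out) weight from normal(0, init_std)."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, init_std, size=(in_channels, out_channels))

# Unit-scale inputs with shape (B, C_in).
x = np.random.default_rng(1).normal(size=(4, 16))

# A small standard deviation keeps the initial logits close to zero ...
w_small = init_gate_weight(16, 8, init_std=0.01)
logits = x @ w_small

# ... so an (assumed) sigmoid gate starts near its neutral value of 0.5.
gates = 1.0 / (1.0 + np.exp(-logits))
```

With logits this small, every gate starts close to half-open, so no channel is switched fully on or off before training begins.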

Notes

Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):

- weight: (in_channels, n_focus * out_channels)
- bias: (n_focus * out_channels,)

in_channels[source]#

out_channels[source]#

n_focus[source]#

precision = 'float64'[source]#

trainable = True[source]#

use_bias = True[source]#

weight[source]#

call(x: Any) → Any[source]#

Parameters:

x: Input array with shape (B, F, C_in), where F = n_focus.

Returns:

Array: Projected array with shape (B, F, C_out).
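The shapes above suggest that each focus stream is projected by its own slice of the stored weight. The following NumPy sketch shows one way this could work; it is not the module's code. It assumes the fused output axis of size n_focus * out_channels splits focus-major into (n_focus, out_channels), which is an assumption, and it checks the vectorized form against an explicit loop over focus streams.

```python
import numpy as np

B, F, C_in, C_out = 2, 3, 4, 5
rng = np.random.default_rng(0)

x = rng.normal(size=(B, F, C_in))
weight = rng.normal(size=(C_in, F * C_out))   # stored in (in, out) layout
bias = rng.normal(size=(F * C_out,))

# Assumed focus-major split of the fused output axis.
w = weight.reshape(C_in, F, C_out)
b = bias.reshape(F, C_out)

# Each focus stream f uses its own (C_in, C_out) matrix w[:, f].
y = np.einsum("bfi,ifo->bfo", x, w) + b       # (B, F, C_out)

# Reference: the same projection written as a loop over focus streams.
y_ref = np.stack([x[:, f] @ w[:, f] + b[f] for f in range(F)], axis=1)
```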

serialize() → dict[str, Any][source]#: Serialize the FocusLinear to a dict.

classmethod deserialize(data: dict[str, Any]) → FocusLinear[source]#: Deserialize a FocusLinear from a dict.

class deepmd.dpmodel.descriptor.dpa4_nn.so3.ChannelLinear(*, in_channels: int, out_channels: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)[source]#

Bases: deepmd.dpmodel.NativeOP

Channel-only linear projection on the last feature axis.

Parameters:

in_channels: Input feature dimension.
out_channels: Output feature dimension.
precision: Parameter precision.
bias: Whether to use bias.
trainable: Whether parameters are trainable.
seed: Random seed for initialization.
init_std: If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.

Notes

Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):

- weight: (in_channels, out_channels)
- bias: (out_channels,)

in_channels[source]#

out_channels[source]#

precision = 'float64'[source]#

trainable = True[source]#

use_bias = True[source]#

weight[source]#

call(x: Any) → Any[source]#

Parameters:

x: Input array with shape (..., C_in).

Returns:

Array: Projected array with shape (..., C_out).
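Because the weight is stored as (in_channels, out_channels), the projection can be written as a plain matrix product over the last axis, with any number of leading axes broadcast through. The NumPy sketch below illustrates that shape behaviour only; the function name channel_linear is hypothetical and the real class may differ in details such as precision handling.

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_out = 4, 6
weight = rng.normal(size=(C_in, C_out))   # (in, out) layout
bias = rng.normal(size=(C_out,))

def channel_linear(x):
    """Hypothetical stand-in: project the last axis from C_in to C_out."""
    return x @ weight + bias

# Leading axes are arbitrary; only the last axis is mixed.
y2 = channel_linear(rng.normal(size=(7, C_in)))
y4 = channel_linear(rng.normal(size=(2, 9, 3, C_in)))
```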

serialize() → dict[str, Any][source]#: Serialize the ChannelLinear to a dict.

classmethod deserialize(data: dict[str, Any]) → ChannelLinear[source]#: Deserialize a ChannelLinear from a dict.

class deepmd.dpmodel.descriptor.dpa4_nn.so3.SO3Linear(*, lmax: int, in_channels: int, out_channels: int, n_focus: int = 1, precision: str = DEFAULT_PRECISION, mlp_bias: bool = False, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)[source]#

Bases: deepmd.dpmodel.NativeOP

Focus-aware degree-wise linear self-interaction.

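This vectorized implementation avoids Python loops over degrees by combining an index gather with an einsum contraction; the names torch.einsum and index_select refer to the PyTorch implementation this module is ported from. The key observation is that one weight matrix is shared by all 2l+1 m components within each l block.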
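Sharing one matrix across all m components of a degree is what lets the channel mixing commute with any transformation that acts block-wise on the (l, m) axis, as rotations do. The NumPy sketch below checks this numerically under simplifying assumptions: a single focus stream, no bias, packed positions ordered by increasing l, and random block matrices standing in for Wigner-D matrices.

```python
import numpy as np

lmax, C_in, C_out, N = 2, 3, 4, 5
D = (lmax + 1) ** 2
rng = np.random.default_rng(0)

# Degree of each packed (l, m) position, assuming blocks ordered l = 0..lmax.
l_of = np.repeat(np.arange(lmax + 1), 2 * np.arange(lmax + 1) + 1)

weight = rng.normal(size=(lmax + 1, C_in, C_out))   # one matrix per degree
x = rng.normal(size=(N, D, C_in))

def mix(feat):
    """Degree-wise channel mixing with weights shared across m."""
    return np.einsum("ndi,dio->ndo", feat, weight[l_of])

# Block-diagonal map on the (l, m) axis; random blocks stand in for rotations.
T = np.zeros((D, D))
start = 0
for l in range(lmax + 1):
    size = 2 * l + 1
    T[start:start + size, start:start + size] = rng.normal(size=(size, size))
    start += size

lhs = mix(np.einsum("de,nei->ndi", T, x))    # transform, then mix
rhs = np.einsum("de,neo->ndo", T, mix(x))    # mix, then transform
```

The two results agree for any block-diagonal T, so they agree in particular for genuine rotation matrices.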

Parameters:

lmax: Maximum spherical harmonic degree.
in_channels: Number of input channels per (l, m) coefficient.
out_channels: Number of output channels per (l, m) coefficient.
n_focus: Number of focus streams.
precision: Parameter precision.
mlp_bias: Whether to use bias for l=0 (scalar) components.
trainable: Whether parameters are trainable.
seed: Random seed for weight initialization.
init_std: If given, use normal(0, init_std) for all weights instead of default trunc-normal fan-in/fan-out init. Use 0.0 for zero initialization.

Notes

Weight storage: (lmax+1, C_in, F*C_out).
Bias storage: (F*C_out,), only applied to l=0 scalar components.
Runtime view restores weights to (lmax+1, C_in, F, C_out) via reshape.
expand_index maps each packed (l,m) position to its l value.
Einsum ndfi,difo->ndfo keeps the whole multi-focus path vectorized.
In HybridMuon slice mode, each per-degree (C_in, F*C_out) slice receives an independent Newton-Schulz (NS) update with stable rectangular scaling.
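The storage and runtime layout described in these notes can be sketched in NumPy as follows. This is an illustrative reconstruction, not the module's code: it assumes packed (l, m) positions are ordered by increasing l, that the fused F*C_out axis splits focus-major, and that the bias is enabled (mlp_bias=True) and added at packed position 0.

```python
import numpy as np

lmax, F, C_in, C_out, N = 2, 2, 3, 4, 5
D = (lmax + 1) ** 2
rng = np.random.default_rng(0)

weight = rng.normal(size=(lmax + 1, C_in, F * C_out))   # storage layout
bias = rng.normal(size=(F * C_out,))                    # used for l=0 only

# Maps each packed (l, m) position to its degree l: [0, 1, 1, 1, 2, ...].
expand_index = np.repeat(np.arange(lmax + 1), 2 * np.arange(lmax + 1) + 1)

x = rng.normal(size=(N, D, F, C_in))

# Runtime view (lmax+1, C_in, F, C_out), then gather one matrix per position.
w = weight.reshape(lmax + 1, C_in, F, C_out)[expand_index]   # (D, C_in, F, C_out)
y = np.einsum("ndfi,difo->ndfo", x, w)
y[:, 0] += bias.reshape(F, C_out)   # packed position 0 is (l=0, m=0)

# Reference: explicit loops over packed positions and focus streams.
y_ref = np.empty((N, D, F, C_out))
for d in range(D):
    for f in range(F):
        y_ref[:, d, f] = x[:, d, f] @ w[d, :, f]
y_ref[:, 0] += bias.reshape(F, C_out)
```

Restricting the bias to the scalar component matters because a constant offset on any l > 0 component would not transform correctly under rotation.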

lmax[source]#

in_channels[source]#

out_channels[source]#

n_focus = 1[source]#

precision = 'float64'[source]#

trainable = True[source]#

embed_dim = 1[source]#

mlp_bias = False[source]#

weight[source]#

expand_index[source]#

call(x: Any) → Any[source]#

Parameters:

x: Input features with shape (N, D, F, C_in) where D=(lmax+1)^2.

Returns:

Array: Degree-wise mixed features with shape (N, D, F, C_out).

serialize() → dict[str, Any][source]#: Serialize the SO3Linear to a dict.

classmethod deserialize(data: dict[str, Any]) → SO3Linear[source]#: Deserialize an SO3Linear from a dict.