deepmd.dpmodel.descriptor.dpa1#

Classes#

`DescrptDPA1`	Attention-based descriptor which is proposed in the pretrainable DPA-1[1] model.
`DescrptBlockSeAtten`	The unit operation of a native model.
`NeighborGatedAttention`	The unit operation of a native model.
`NeighborGatedAttentionLayer`	The unit operation of a native model.
`GatedAttentionLayer`	The unit operation of a native model.

Functions#

`np_softmax`(→ deepmd.dpmodel.array_api.Array)
`np_normalize`(→ deepmd.dpmodel.array_api.Array)

Module Contents#

deepmd.dpmodel.descriptor.dpa1.np_softmax(x: deepmd.dpmodel.array_api.Array, axis: int = -1) → deepmd.dpmodel.array_api.Array[source]#

deepmd.dpmodel.descriptor.dpa1.np_normalize(x: deepmd.dpmodel.array_api.Array, axis: int = -1) → deepmd.dpmodel.array_api.Array[source]#
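The two signatures above carry no description. As a rough illustration of what a softmax and an L2 normalization along an axis compute, here is a minimal NumPy sketch. The names `softmax_sketch` and `normalize_sketch` are hypothetical and this is not the library's code; details such as zero-norm handling are left out.

```python
import numpy as np


def softmax_sketch(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the per-slice maximum so that exp() cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def normalize_sketch(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Divide by the Euclidean norm taken along `axis`.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


probs = softmax_sketch(np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]]))
unit = normalize_sketch(np.array([[3.0, 4.0]]))
```

Each row of `probs` sums to one, and `unit` has unit length along the last axis.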

class deepmd.dpmodel.descriptor.dpa1.DescrptDPA1(rcut: float, rcut_smth: float, sel: list[int] | int, ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: list[tuple[int, int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, scaling_factor: float = 1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 1e-05, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, spin: None = None, stripped_type_embedding: bool | None = None, use_econf_tebd: bool = False, use_tebd_bias: bool = False, type_map: list[str] | None = None, seed: int | list[int] | None = None)[source]#

Bases: deepmd.dpmodel.NativeOP, deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor

Attention-based descriptor which is proposed in the pretrainable DPA-1[1] model.

This descriptor, \(\mathcal{D}^i \in \mathbb{R}^{M \times M_{<}}\), is given by

\[\mathcal{D}^i = \frac{1}{N_c^2}(\hat{\mathcal{G}}^i)^T \mathcal{R}^i (\mathcal{R}^i)^T \hat{\mathcal{G}}^i_<,\]

where \(\hat{\mathcal{G}}^i\) represents the embedding matrix \(\mathcal{G}^i\) after an additional self-attention mechanism and \(\mathcal{R}^i\) is defined by the full case in the se_e2_a descriptor. Note that we obtain \(\mathcal{G}^i\) using the type embedding method by default in this descriptor.
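The contraction in the formula above can be checked numerically. The sketch below uses random stand-ins for \(\hat{\mathcal{G}}^i\) and \(\mathcal{R}^i\), and assumes that \(\mathcal{R}^i\) has four columns per neighbor and that \(\hat{\mathcal{G}}^i_<\) is the first \(M_<\) columns of \(\hat{\mathcal{G}}^i\). All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_c, m, m_sub = 6, 10, 4            # neighbors N_c, embedding width M, sub-width M_<
g_hat = rng.normal(size=(n_c, m))   # stand-in for \hat{G}^i
r = rng.normal(size=(n_c, 4))       # stand-in for R^i (assumed N_c x 4)

g_sub = g_hat[:, :m_sub]            # assumed: first M_< columns of \hat{G}^i
left = g_hat.T @ r / n_c            # M x 4
right = g_sub.T @ r / n_c           # M_< x 4
descriptor = left @ right.T         # M x M_<
```

Computing the two `N_c`-contractions first keeps the intermediate arrays at size `M x 4` instead of forming the `N_c x N_c` product \(\mathcal{R}^i (\mathcal{R}^i)^T\).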

To perform the self-attention mechanism, the queries \(\mathcal{Q}^{i,l} \in \mathbb{R}^{N_c\times d_k}\), keys \(\mathcal{K}^{i,l} \in \mathbb{R}^{N_c\times d_k}\), and values \(\mathcal{V}^{i,l} \in \mathbb{R}^{N_c\times d_v}\) are first obtained:

\[\left(\mathcal{Q}^{i,l}\right)_{j}=Q_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]

\[\left(\mathcal{K}^{i,l}\right)_{j}=K_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]

\[\left(\mathcal{V}^{i,l}\right)_{j}=V_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]

where \(Q_{l}\), \(K_{l}\), \(V_{l}\) represent three trainable linear transformations that output the queries and keys of dimension \(d_k\) and values of dimension \(d_v\), and \(l\) is the index of the attention layer. The input embedding matrix to the attention layers, denoted by \(\mathcal{G}^{i,0}\), is chosen as the two-body embedding matrix.

Then the scaled dot-product attention method is adopted:

\[A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})=\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right)\mathcal{V}^{i,l},\]

where \(\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) \in \mathbb{R}^{N_c\times N_c}\) is the matrix of attention weights. In the original attention method, one typically has \(\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}\right)=\mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right)\), with \(\sqrt{d_{k}}\) being the normalization temperature. This is slightly modified to incorporate the angular information:

\[\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) = \mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right) \odot \hat{\mathcal{R}}^{i}(\hat{\mathcal{R}}^{i})^{T},\]

where \(\hat{\mathcal{R}}^{i} \in \mathbb{R}^{N_c\times 3}\) denotes the normalized relative coordinates, \(\hat{\mathcal{R}}^{i}_{j} = \frac{\boldsymbol{r}_{ij}}{\lVert \boldsymbol{r}_{ij} \rVert}\), and \(\odot\) means element-wise multiplication.
Layer normalization is then added in a residual way to finally obtain the self-attention local embedding matrix \(\hat{\mathcal{G}}^{i} = \mathcal{G}^{i,L_a}\) after \(L_a\) attention layers [1]:

\[\mathcal{G}^{i,l} = \mathcal{G}^{i,l-1} + \mathrm{LayerNorm}(A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})).\]
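The equations above can be followed step by step in a small NumPy sketch of one attention layer. It is a simplification under stated assumptions, not the module's implementation: the value dimension \(d_v\) is taken equal to the embedding width so that the residual sum is well defined, the linear maps have no bias, and layer normalization has no trainable scale or shift. All function and variable names are illustrative.

```python
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def layer_norm(x, eps=1e-5):
    # Plain layer normalization over the feature axis.
    mu = np.mean(x, axis=-1, keepdims=True)
    var = np.var(x, axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def attention_layer(g, r_hat, w_q, w_k, w_v):
    # Queries, keys and values from the previous-layer embedding G^{i,l-1}.
    q, k, v = g @ w_q, g @ w_k, g @ w_v
    d_k = q.shape[-1]
    # Scaled dot-product weights, gated element-wise by the angular term.
    weights = softmax(q @ k.T / np.sqrt(d_k), axis=-1)
    weights = weights * (r_hat @ r_hat.T)
    # Residual update with layer normalization.
    return g + layer_norm(weights @ v)


rng = np.random.default_rng(0)
n_c, m, d_k = 5, 8, 4
g0 = rng.normal(size=(n_c, m))
r_ij = rng.normal(size=(n_c, 3))
r_hat = r_ij / np.linalg.norm(r_ij, axis=-1, keepdims=True)
w_q = rng.normal(size=(m, d_k))
w_k = rng.normal(size=(m, d_k))
w_v = rng.normal(size=(m, m))
g1 = attention_layer(g0, r_hat, w_q, w_k, w_v)
```

In this sketch, reordering the neighbors reorders the rows of the output in the same way, so the layer does not depend on neighbor ordering.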

Parameters:

rcut: float: The cut-off radius \(r_c\)
rcut_smth: float: From where the environment matrix should be smoothed \(r_s\)
sel: list[int], int: If a list[int], sel[i] specifies the maximum number of type i atoms within the cut-off radius. If an int, it is the total maximum number of atoms within the cut-off radius.
ntypes: int: Number of element types
neuron: list[int]: Number of neurons in each hidden layer of the embedding net \(\mathcal{N}\)
axis_neuron: int: Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)
tebd_dim: int: Dimension of the type embedding
tebd_input_mode: str: The input mode of the type embedding. Supported modes are [“concat”, “strip”]. - “concat”: Concatenate the type embedding with the smoothed radial information as the union input for the embedding network. - “strip”: Use a separated embedding network for the type embedding and combine the output with the radial embedding network output.
resnet_dt: bool: Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
trainable: bool: Whether the weights of this descriptor are trainable.
trainable_ln: bool: Whether to use trainable shift and scale weights in layer normalization.
ln_eps: float, Optional: The epsilon value for layer normalization.
type_one_side: bool: If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
attn: int: Hidden dimension of the attention vectors
attn_layer: int: Number of attention layers
attn_dotr: bool: Whether to multiply the attention weights by the angular gate.
attn_mask: bool: Whether to mask the diagonal of the attention weights. Only False is supported, to stay consistent with the other backend references; the True option is not implemented in this version.
exclude_types: list[list[int]]: The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection: float: Protection parameter to prevent division by zero errors during environment matrix calculations.
set_davg_zero: bool: Set the shift of embedding net input to zero.
activation_function: str: The activation function in the embedding net. Supported options are “relu6”, “gelu”, “gelu_tf”, “silu”, “softplus”, “none”, “linear”, “tanh”, “relu”, “silut”, “sigmoid”.
precision: str: The precision of the embedding net parameters. Supported options are “float16”, “default”, “bfloat16”, “float32”, “float64”.
scaling_factor: float: The scaling factor of normalization in calculations of attention weights. If temperature is None, the scaling of attention weights is (N_dim * scaling_factor)**0.5
normalize: bool: Whether to normalize the hidden vectors in attention weights calculation.
temperature: float: If not None, the scaling of attention weights is temperature itself.
smooth_type_embedding: bool: Whether to use smooth process in attention weights calculation.
concat_output_tebd: bool: Whether to concat type embedding at the output of the descriptor.
stripped_type_embedding: bool, Optional: (Deprecated, kept only for compatibility.) Whether to strip the type embedding into a separate embedding network. Setting this parameter to True is equivalent to setting tebd_input_mode to ‘strip’. Setting it to False is equivalent to setting tebd_input_mode to ‘concat’. The default value is None, which means the tebd_input_mode setting will be used instead.
use_econf_tebd: bool, Optional: Whether to use electronic configuration type embedding.
use_tebd_bias: bool, Optional: Whether to use bias in the type embedding layer.
type_map: list[str], Optional: A list of strings giving the name of each atom type.
spin: (Only support None to keep consistent with other backend references.) (Not used in this version. Not-none option is not implemented.) The old implementation of deepspin.
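Two of the parameter rules above are simple enough to write out: how `temperature` overrides the default attention scaling, and how the deprecated `stripped_type_embedding` maps onto `tebd_input_mode`. The helper names below are hypothetical, and `n_dim` stands in for the N_dim mentioned under `scaling_factor`, which this page does not define further.

```python
from typing import Optional


def attention_scale(
    n_dim: int,
    scaling_factor: float = 1.0,
    temperature: Optional[float] = None,
) -> float:
    # An explicit temperature is used as-is.
    if temperature is not None:
        return temperature
    # Otherwise fall back to (N_dim * scaling_factor) ** 0.5.
    return (n_dim * scaling_factor) ** 0.5


def resolve_tebd_input_mode(
    tebd_input_mode: str = "concat",
    stripped_type_embedding: Optional[bool] = None,
) -> str:
    # None means the deprecated flag is ignored.
    if stripped_type_embedding is None:
        return tebd_input_mode
    return "strip" if stripped_type_embedding else "concat"
```

For example, `attention_scale(64, scaling_factor=4.0)` evaluates the fallback branch, while `attention_scale(64, temperature=2.0)` returns the temperature unchanged.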

References

[1]

Duo Zhang, Hangrui Bi, Fu-Zhi Dai, Wanrun Jiang, Linfeng Zhang, and Han Wang. 2022. DPA-1: Pretraining of Attention-based Deep Potential Model for Molecular Simulation. arXiv preprint arXiv:2208.08236.

se_atten[source]#

use_econf_tebd = False[source]#

use_tebd_bias = False[source]#

type_map = None[source]#

type_embedding[source]#

tebd_dim = 8[source]#

concat_output_tebd = True[source]#

trainable = True[source]#

precision = 'float64'[source]#

get_rcut() → float[source]#: Returns the cut-off radius.

get_rcut_smth() → float[source]#: Returns the radius where the neighbor information starts to smoothly decay to 0.
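To picture what "smoothly decay to 0" between `rcut_smth` and `rcut` can look like, the sketch below uses a quintic polynomial switch that equals 1 below `rcut_smth` and 0 at and beyond `rcut`. That this exact polynomial is what the module uses is an assumption; the function name is hypothetical.

```python
import numpy as np


def smooth_switch(r, rcut_smth: float, rcut: float) -> np.ndarray:
    # u runs from 0 at rcut_smth to 1 at rcut and is clamped outside that range.
    u = np.clip((np.asarray(r, dtype=float) - rcut_smth) / (rcut - rcut_smth), 0.0, 1.0)
    # Quintic polynomial with zero first and second derivatives at both ends.
    return u**3 * (-6.0 * u**2 + 15.0 * u - 10.0) + 1.0


values = smooth_switch([0.3, 3.25, 6.0, 7.0], rcut_smth=0.5, rcut=6.0)
```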

get_nsel() → int[source]#: Returns the number of selected atoms in the cut-off radius.

get_sel() → list[int][source]#: Returns the number of selected atoms for each type.

get_ntypes() → int[source]#: Returns the number of element types.

get_type_map() → list[str][source]#: Returns the name of each atom type.

get_dim_out() → int[source]#: Returns the output dimension.

get_dim_emb() → int[source]#: Returns the embedding dimension of g2.

mixed_types() → bool[source]#

If true, the descriptor 1. assumes total number of atoms aligned across frames; 2. requires a neighbor list that does not distinguish different atomic types.

If false, the descriptor 1. assumes total number of atoms of each atom type aligned across frames; 2. requires a neighbor list that distinguishes different atomic types.

has_message_passing() → bool[source]#: Returns whether the descriptor has message passing.

need_sorted_nlist_for_lower() → bool[source]#: Returns whether the descriptor needs sorted nlist when using forward_lower.

get_env_protection() → float[source]#: Returns the protection of building environment matrix.

abstractmethod share_params(base_class: DescrptDPA1, shared_level: int, resume: bool = False) → NoReturn[source]#: Share the parameters of self with base_class at the given shared_level during multitask training. If training does not start from a checkpoint (resume is False), some separate parameters (e.g. mean and stddev) are recalculated across the different classes.

property dim_out: int[source]#

property dim_emb: int[source]#

compute_input_stats(merged: Callable[[], list[dict]] | list[dict], path: deepmd.utils.path.DPPath | None = None) → None[source]#

Compute the input statistics (e.g. mean and stddev) for the descriptors from packed data.

Parameters:

merged: Union[Callable[[], list[dict]], list[dict]]: One of the following.
- list[dict]: A list of data samples from various data systems. Each element, merged[i], is a data dictionary mapping keys to torch.Tensor values originating from the i-th data system.
- Callable[[], list[dict]]: A lazy function that returns data samples in the above format only when needed. Since the sampling process can be slow and memory-intensive, the lazy function helps by only sampling once.
path: Optional[DPPath]: The path to the stat file.
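The two accepted forms of `merged` can be handled uniformly by calling the argument only when it is callable. The sketch below also shows one way to make a lazy sampler run its expensive step only once, using `functools.lru_cache`. The sample keys (`coord`, `atype`) and the helper names are illustrative assumptions, not the library's data format.

```python
from functools import lru_cache

calls = {"n": 0}  # counts how often the expensive sampling step actually runs


@lru_cache(maxsize=None)
def lazy_samples() -> list:
    # Stand-in for a slow, memory-intensive sampling step.
    calls["n"] += 1
    return [{"coord": [[0.0, 0.0, 0.0]], "atype": [0]}]


def resolve_merged(merged) -> list:
    # Accept either a ready list of samples or a zero-argument callable.
    return merged() if callable(merged) else merged


first = resolve_merged(lazy_samples)
second = resolve_merged(lazy_samples)
eager = resolve_merged([{"atype": [1]}])
```

Both calls through `lazy_samples` return the same cached list, so the sampling body executes a single time.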

set_stat_mean_and_stddev(mean: deepmd.dpmodel.array_api.Array, stddev: deepmd.dpmodel.array_api.Array) → None[source]#: Update mean and stddev for descriptor.

get_stat_mean_and_stddev() → tuple[deepmd.dpmodel.array_api.Array, deepmd.dpmodel.array_api.Array][source]#: Get mean and stddev for descriptor.

change_type_map(type_map: list[str], model_with_new_type_stat: DescrptDPA1 | None = None) → None[source]#: Change the type-related parameters to new ones, according to type_map and the original type map in the model. If type_map contains new types, their statistics are updated from model_with_new_type_stat.

call(coord_ext: deepmd.dpmodel.array_api.Array, atype_ext: deepmd.dpmodel.array_api.Array, nlist: deepmd.dpmodel.array_api.Array, mapping: deepmd.dpmodel.array_api.Array | None = None) → deepmd.dpmodel.array_api.Array[source]#

Compute the descriptor.

Parameters:

coord_ext: The extended coordinates of atoms. shape: nf x (nallx3)
atype_ext: The extended atom types. shape: nf x nall
nlist: The neighbor list. shape: nf x nloc x nnei
mapping: The index mapping from extended to local region. Not used by this descriptor.

Returns:

descriptor: The descriptor. shape: nf x nloc x (ng x axis_neuron)
gr: The rotationally equivariant and permutationally invariant single particle representation. shape: nf x nloc x ng x 3
g2: The rotationally invariant pair-particle representation. This descriptor returns None.
h2: The rotationally equivariant pair-particle representation. This descriptor returns None.
sw: The smooth switch function.
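The input shapes listed above can be made concrete with a tiny example that unpacks `coord_ext` and gathers neighbor coordinates through `nlist`. Two conventions are assumed here and are not stated on this page: the first `nloc` entries of the extended arrays are the local atoms, and `-1` in `nlist` marks an empty neighbor slot.

```python
import numpy as np

nf, nloc, nall, nnei = 1, 2, 3, 2
coord_ext = np.arange(nf * nall * 3, dtype=float).reshape(nf, nall * 3)  # nf x (nall x 3)
atype_ext = np.array([[0, 1, 0]])                                         # nf x nall
nlist = np.array([[[1, 2], [0, -1]]])                                     # nf x nloc x nnei

coord = coord_ext.reshape(nf, nall, 3)
mask = nlist >= 0                      # True where a real neighbor is present
safe = np.where(mask, nlist, 0)        # replace padding by a valid index
frame = np.arange(nf)[:, None, None]
nei_coord = coord[frame, safe]         # nf x nloc x nnei x 3
# Relative coordinates r_ij, zeroed on padded slots.
rel = (nei_coord - coord[:, :nloc, None, :]) * mask[..., None]
```

Here `rel` has shape `nf x nloc x nnei x 3`, matching the per-neighbor layout that the attention formulas operate on.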

serialize() → dict[source]#: Serialize the descriptor to dict.

classmethod deserialize(data: dict) → DescrptDPA1[source]#: Deserialize from dict.

classmethod update_sel(train_data: deepmd.utils.data_system.DeepmdDataSystem, type_map: list[str] | None, local_jdata: dict) → tuple[deepmd.dpmodel.array_api.Array, deepmd.dpmodel.array_api.Array][source]#

Update the selection and perform neighbor statistics.

Parameters:

train_data: DeepmdDataSystem: The data used to do neighbor statistics
type_map: list[str], optional: The name of each type of atoms
local_jdata: dict: The local data referring to the current class

Returns:

dict: The updated local data
float: The minimum distance between two atoms

class deepmd.dpmodel.descriptor.dpa1.DescrptBlockSeAtten(rcut: float, rcut_smth: float, sel: list[int] | int, ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, type_one_side: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: list[tuple[int, int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, scaling_factor: float = 1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 1e-05, smooth: bool = True, seed: int | list[int] | None = None, trainable: bool = True)[source]#

Bases: deepmd.dpmodel.NativeOP, deepmd.dpmodel.descriptor.descriptor.DescriptorBlock

The unit operation of a native model.

rcut[source]#

rcut_smth[source]#

sel[source]#

nnei[source]#

ntypes[source]#

neuron = [25, 50, 100][source]#

filter_neuron = [25, 50, 100][source]#

axis_neuron = 8[source]#

tebd_dim = 8[source]#

tebd_input_mode = 'concat'[source]#

resnet_dt = False[source]#

trainable_ln = True[source]#

ln_eps = 1e-05[source]#

type_one_side = False[source]#

attn = 128[source]#

attn_layer = 2[source]#

attn_dotr = True[source]#

attn_mask = False[source]#

exclude_types = [][source]#

env_protection = 0.0[source]#

set_davg_zero = False[source]#

activation_function = 'tanh'[source]#

precision = 'float64'[source]#

scaling_factor = 1.0[source]#

normalize = True[source]#

temperature = None[source]#

smooth = True[source]#

tebd_dim_input = 16[source]#

embeddings[source]#

dpa1_attention[source]#

env_mat[source]#

mean[source]#

stddev[source]#

orig_sel[source]#

get_rcut() → float[source]#: Returns the cut-off radius.

get_rcut_smth() → float[source]#: Returns the radius where the neighbor information starts to smoothly decay to 0.

get_nsel() → int[source]#: Returns the number of selected atoms in the cut-off radius.

get_sel() → list[int][source]#: Returns the number of selected atoms for each type.

get_ntypes() → int[source]#: Returns the number of element types.

get_dim_in() → int[source]#: Returns the input dimension.

get_dim_out() → int[source]#: Returns the output dimension.

get_dim_emb() → int[source]#: Returns the output dimension of embedding.

__setitem__(key: str, value: deepmd.dpmodel.array_api.Array) → None[source]#

__getitem__(key: str) → deepmd.dpmodel.array_api.Array[source]#

mixed_types() → bool[source]#

If true, the descriptor 1. assumes total number of atoms aligned across frames; 2. requires a neighbor list that does not distinguish different atomic types.

If false, the descriptor 1. assumes total number of atoms of each atom type aligned across frames; 2. requires a neighbor list that distinguishes different atomic types.

get_env_protection() → float[source]#: Returns the protection of building environment matrix.

property dim_out: int[source]#: Returns the output dimension of this descriptor.

property dim_in: int[source]#: Returns the atomic input dimension of this descriptor.

property dim_emb: int[source]#: Returns the output dimension of embedding.

compute_input_stats(merged: Callable[[], list[dict]] | list[dict], path: deepmd.utils.path.DPPath | None = None) → None[source]#

Compute the input statistics (e.g. mean and stddev) for the descriptors from packed data.

Parameters:

merged: Union[Callable[[], list[dict]], list[dict]]: One of the following.
- list[dict]: A list of data samples from various data systems. Each element, merged[i], is a data dictionary mapping keys to paddle.Tensor values originating from the i-th data system.
- Callable[[], list[dict]]: A lazy function that returns data samples in the above format only when needed. Since the sampling process can be slow and memory-intensive, the lazy function helps by only sampling once.
path: Optional[DPPath]: The path to the stat file.

get_stats() → dict[str, deepmd.utils.env_mat_stat.StatItem][source]#: Get the statistics of the descriptor.

reinit_exclude(exclude_types: list[tuple[int, int]] = []) → None[source]#
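The `exclude_types` argument is a list of type pairs with no interaction. One way to turn such a list into a per-pair boolean mask is sketched below; it treats each pair as symmetric, which is an assumption, and the function name is hypothetical.

```python
import numpy as np


def pair_exclude_mask(atype_center, atype_neighbor, ntypes, exclude_types):
    # Lookup table: True where a (center type, neighbor type) pair interacts.
    table = np.ones((ntypes, ntypes), dtype=bool)
    for ti, tj in exclude_types:
        table[ti, tj] = False
        table[tj, ti] = False  # assumed symmetric
    center = np.asarray(atype_center)[:, None]
    neighbor = np.asarray(atype_neighbor)[None, :]
    return table[center, neighbor]


pair_mask = pair_exclude_mask([0, 1], [0, 1, 1], ntypes=2, exclude_types=[(0, 1)])
```

With `exclude_types=[(0, 1)]`, only same-type pairs remain `True` in this two-type example.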

cal_g(ss: deepmd.dpmodel.array_api.Array, embedding_idx: int) → deepmd.dpmodel.array_api.Array[source]#

cal_g_strip(ss: deepmd.dpmodel.array_api.Array, embedding_idx: int) → deepmd.dpmodel.array_api.Array[source]#

call(nlist: deepmd.dpmodel.array_api.Array, coord_ext: deepmd.dpmodel.array_api.Array, atype_ext: deepmd.dpmodel.array_api.Array, atype_embd_ext: deepmd.dpmodel.array_api.Array | None = None, mapping: deepmd.dpmodel.array_api.Array | None = None, type_embedding: deepmd.dpmodel.array_api.Array | None = None) → tuple[deepmd.dpmodel.array_api.Array, deepmd.dpmodel.array_api.Array][source]#: Forward pass in NumPy implementation.

has_message_passing() → bool[source]#: Returns whether the descriptor block has message passing.

need_sorted_nlist_for_lower() → bool[source]#: Returns whether the descriptor block needs sorted nlist when using forward_lower.

serialize() → dict[source]#: Serialize the descriptor to dict.

classmethod deserialize(data: dict) → DescrptDPA1[source]#: Deserialize from dict.

class deepmd.dpmodel.descriptor.dpa1.NeighborGatedAttention(layer_num: int, nnei: int, embed_dim: int, hidden_dim: int, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float = 1e-05, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: int | list[int] | None = None, trainable: bool = True)[source]#

Bases: deepmd.dpmodel.NativeOP

The unit operation of a native model.

layer_num[source]#

nnei[source]#

embed_dim[source]#

hidden_dim[source]#

dotr = False[source]#

do_mask = False[source]#

scaling_factor = 1.0[source]#

normalize = True[source]#

temperature = None[source]#

trainable_ln = True[source]#

ln_eps = 1e-05[source]#

smooth = True[source]#

precision = 'float64'[source]#

network_type[source]#

attention_layers[source]#

call(input_G: deepmd.dpmodel.array_api.Array, nei_mask: deepmd.dpmodel.array_api.Array, input_r: deepmd.dpmodel.array_api.Array | None = None, sw: deepmd.dpmodel.array_api.Array | None = None) → deepmd.dpmodel.array_api.Array[source]#: Forward pass in NumPy implementation.

__getitem__(key: int) → NeighborGatedAttentionLayer[source]#

__setitem__(key: int, value: NeighborGatedAttentionLayer | dict) → None[source]#

serialize() → dict[source]#

Serialize the networks to a dict.

Returns:

dict: The serialized networks.

classmethod deserialize(data: dict) → NeighborGatedAttention[source]#

Deserialize the networks from a dict.

Parameters:

data: dict: The dict to deserialize from.

class deepmd.dpmodel.descriptor.dpa1.NeighborGatedAttentionLayer(nnei: int, embed_dim: int, hidden_dim: int, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float = 1e-05, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: int | list[int] | None = None, trainable: bool = True)[source]#

Bases: deepmd.dpmodel.NativeOP

The unit operation of a native model.

nnei[source]#

embed_dim[source]#

hidden_dim[source]#

dotr = False[source]#

do_mask = False[source]#

scaling_factor = 1.0[source]#

normalize = True[source]#

temperature = None[source]#

trainable_ln = True[source]#

ln_eps = 1e-05[source]#

precision = 'float64'[source]#

attention_layer[source]#

attn_layer_norm[source]#

call(x: deepmd.dpmodel.array_api.Array, nei_mask: deepmd.dpmodel.array_api.Array, input_r: deepmd.dpmodel.array_api.Array | None = None, sw: deepmd.dpmodel.array_api.Array | None = None) → deepmd.dpmodel.array_api.Array[source]#: Forward pass in NumPy implementation.

serialize() → dict[source]#

Serialize the networks to a dict.

Returns:

dict: The serialized networks.

classmethod deserialize(data: dict) → NeighborGatedAttentionLayer[source]#

Deserialize the networks from a dict.

Parameters:

data: dict: The dict to deserialize from.

class deepmd.dpmodel.descriptor.dpa1.GatedAttentionLayer(nnei: int, embed_dim: int, hidden_dim: int, num_heads: int = 1, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: float | None = None, bias: bool = True, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: int | list[int] | None = None, trainable: bool = True)[source]#

Bases: deepmd.dpmodel.NativeOP

The unit operation of a native model.

nnei[source]#

embed_dim[source]#

hidden_dim[source]#

num_heads = 1[source]#

head_dim[source]#

dotr = False[source]#

do_mask = False[source]#

bias = True[source]#

smooth = True[source]#

scaling_factor = 1.0[source]#

temperature = None[source]#

precision = 'float64'[source]#

scaling[source]#

normalize = True[source]#

in_proj[source]#

out_proj[source]#

call(query: deepmd.dpmodel.array_api.Array, nei_mask: deepmd.dpmodel.array_api.Array, input_r: deepmd.dpmodel.array_api.Array | None = None, sw: deepmd.dpmodel.array_api.Array | None = None, attnw_shift: float = 20.0) → tuple[deepmd.dpmodel.array_api.Array, deepmd.dpmodel.array_api.Array][source]#: Forward pass in NumPy implementation.
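The `sw` and `attnw_shift` arguments are not described on this page. One plausible reading, shown here purely as an assumption, is that the switch values damp the attention logits towards `-attnw_shift` before the softmax and scale the weights afterwards, so that a neighbor approaching the cut-off loses its weight continuously. The function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def smooth_attention_weights(logits, sw, attnw_shift=20.0):
    # logits: nnei x nnei attention scores; sw: nnei switch values in [0, 1].
    gate = sw[:, None] * sw[None, :]
    # Pairs with a small switch value are pushed towards -attnw_shift.
    shifted = (logits + attnw_shift) * gate - attnw_shift
    # After the softmax, the gate removes whatever weight is left on them.
    return softmax(shifted, axis=-1) * gate


rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 3))
w_full = smooth_attention_weights(logits, np.ones(3))
w_cut = smooth_attention_weights(logits, np.array([1.0, 1.0, 0.0]))
```

With all switch values equal to one the result reduces to an ordinary softmax, and a neighbor with switch value zero receives and contributes no weight.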

serialize() → dict[source]#

classmethod deserialize(data: dict) → GatedAttentionLayer[source]#