deepmd.tf.descriptor.se_atten

Module Contents

Classes

DescrptSeAtten

Smooth version of the descriptor with attention.

DescrptDPA1Compat

Consistent version of the model for testing with other backend references.

Attributes

log

deepmd.tf.descriptor.se_atten.log[source]
class deepmd.tf.descriptor.se_atten.DescrptSeAtten(rcut: float, rcut_smth: float, sel: List[int] | int, ntypes: int, neuron: List[int] = [25, 50, 100], axis_neuron: int = 8, resnet_dt: bool = False, trainable: bool = True, seed: int | None = None, type_one_side: bool = True, set_davg_zero: bool = True, exclude_types: List[List[int]] = [], activation_function: str = 'tanh', precision: str = 'default', uniform_seed: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, multi_task: bool = False, stripped_type_embedding: bool = False, smooth_type_embedding: bool = False, scaling_factor=1.0, normalize=True, temperature=None, trainable_ln: bool = True, ln_eps: float | None = 0.001, concat_output_tebd: bool = True, env_protection: float = 0.0, **kwargs)[source]

Bases: deepmd.tf.descriptor.se_a.DescrptSeA

Smooth version of the descriptor with attention.

Parameters:
rcut: float

The cut-off radius \(r_c\)

rcut_smth: float

From where the environment matrix should be smoothed \(r_s\)

sel: list[int], int

list[int]: sel[i] specifies the maximum number of type i atoms within the cut-off radius. int: the total maximum number of atoms within the cut-off radius.

neuron: list[int]

Number of neurons in each hidden layer of the embedding net \(\mathcal{N}\)

axis_neuron: int

Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)

resnet_dt: bool

Time-step dt in the resnet construction: \(y = x + dt \cdot \phi(Wx + b)\)

trainable: bool

Whether the weights of the embedding net are trainable.

seed: int, Optional

Random seed for initializing the network parameters.

type_one_side: bool

If True, build N_types embedding nets; otherwise, build N_types^2 embedding nets.

exclude_types: List[List[int]]

The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.

set_davg_zero: bool

Set the shift of embedding net input to zero.

activation_function: str

The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.

precision: str

The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.

uniform_seed: bool

Only for backward compatibility; restores the old behavior of using the random seed.

attn: int

The length of the hidden vector in the scaled-dot-product attention computation.

attn_layer: int

The number of layers in the attention mechanism.

attn_dotr: bool

Whether to dot the relative coordinates into the attention weights as a gating scheme.

attn_mask: bool

Whether to mask the diagonal in the attention weights.

ln_eps: float, Optional

The epsilon value for layer normalization.

multi_task: bool

Whether the model has multiple fitting nets to train.

stripped_type_embedding: bool

Whether to strip the type embedding into a separate embedding network. The default is True in the se_atten_v2 descriptor.

smooth_type_embedding: bool

Whether to use a smooth process in the attention-weight calculation. When stripped type embedding is used, this also controls whether the smooth factor is dotted onto the output of the type-embedding network to keep the descriptor smooth, instead of setting set_davg_zero to True. The default is True in the se_atten_v2 descriptor.

Raises:
ValueError

if ntypes is 0.
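
A minimal construction sketch (not part of the original documentation) follows; all numerical values are illustrative placeholders, not recommended settings.

from deepmd.tf.descriptor.se_atten import DescrptSeAtten

# Hypothetical two-element system; the values are placeholders.
descrpt = DescrptSeAtten(
    rcut=6.0,              # cut-off radius r_c
    rcut_smth=0.5,         # smoothing starts at r_s
    sel=120,               # total maximum number of neighbors (int form of sel)
    ntypes=2,              # must be non-zero, otherwise ValueError is raised
    neuron=[25, 50, 100],  # embedding-net hidden layers
    axis_neuron=8,         # number of axis neurons M_2
    attn=128,              # hidden dimension of the attention vectors
    attn_layer=2,          # number of attention layers
)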

property explicit_ntypes: bool[source]

Explicit ntypes with type embedding.

compute_input_stats(data_coord: list, data_box: list, data_atype: list, natoms_vec: list, mesh: list, input_dict: dict, mixed_type: bool = False, real_natoms_vec: list | None = None, **kwargs) None[source]

Compute the statistics (avg and std) of the training data. The input will be normalized by the statistics.

Parameters:
data_coord

The coordinates. Can be generated by deepmd.tf.model.make_stat_input

data_box

The box. Can be generated by deepmd.tf.model.make_stat_input

data_atype

The atom types. Can be generated by deepmd.tf.model.make_stat_input

natoms_vec

The vector for the number of atoms of the system and the different types of atoms. If mixed_type is True, this parameter is left blank; see real_natoms_vec.

mesh

The mesh for neighbor searching. Can be generated by deepmd.tf.model.make_stat_input

input_dict

Dictionary for additional input

mixed_type

Whether to run in mixed_type mode. If True, the input data has the mixed_type format (see doc/model/train_se_atten.md), in which frames within a system may have different natoms_vec(s) while sharing the same nloc.

real_natoms_vec

If mixed_type is True, it takes in the real natoms_vec for each frame.

**kwargs

Additional keyword arguments.
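
A schematic call is sketched below; it is an assumption-laden illustration rather than a runnable script, and the statistics inputs are assumed to have been produced by deepmd.tf.model.make_stat_input as described above (their construction is omitted).

descrpt.compute_input_stats(
    data_coord=data_coord,    # list of coordinate arrays, one entry per system
    data_box=data_box,        # list of box arrays
    data_atype=data_atype,    # list of atom-type arrays
    natoms_vec=natoms_vec,    # per-system natoms vectors
    mesh=mesh,                # neighbor-searching mesh
    input_dict={},            # no additional input in this sketch
    mixed_type=False,         # set True (with real_natoms_vec) for mixed-type data
)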

enable_compression(min_nbor_dist: float, graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, table_extrapolate: float = 5, table_stride_1: float = 0.01, table_stride_2: float = 0.1, check_frequency: int = -1, suffix: str = '') None[source]

Receive the statistics (distance, max_nbor_size and env_mat_range) of the training data.

Parameters:
min_nbor_dist

The nearest distance between atoms

graph: tf.Graph

The graph of the model

graph_def: tf.GraphDef

The graph_def of the model

table_extrapolate

The scale of model extrapolation

table_stride_1

The uniform stride of the first table

table_stride_2

The uniform stride of the second table

check_frequency

The overflow check frequency

suffix: str, optional

The suffix of the scope
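
A hedged sketch of a typical call follows. It loads a frozen graph with plain TensorFlow calls (assuming deepmd.tf.env.tf exposes the TF1-style API used in the annotations above) and then enables compression on the descriptor constructed earlier; the file name is a placeholder.

from deepmd.tf.env import tf

# Load a frozen model; "frozen_model.pb" is a placeholder path.
graph_def = tf.GraphDef()
with open("frozen_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

descrpt.enable_compression(
    min_nbor_dist=0.5,     # nearest inter-atomic distance in the training data (illustrative)
    graph=graph,
    graph_def=graph_def,
    table_extrapolate=5,
    table_stride_1=0.01,
    table_stride_2=0.1,
)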

build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor[source]

Build the computational graph for the descriptor.

Parameters:
coord_

The coordinate of atoms

atype_

The type of atoms

natoms

The number of atoms. This tensor has the length Ntypes + 2: natoms[0] is the number of local atoms; natoms[1] is the total number of atoms held by this processor; natoms[i], 2 <= i < Ntypes + 2, is the number of type i atoms.

box_: tf.Tensor

The box of the system

mesh

For historical reasons, only the length of this Tensor matters: if the size of mesh is 6, PBC is assumed; if the size of mesh is 0, no PBC is assumed.

input_dict

Dictionary for additional inputs

reuse

Whether the weights in the networks should be reused when getting the variables.

suffix

Name suffix to identify this descriptor

Returns:
descriptor

The output descriptor
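
The sketch below wires the descriptor into a TF1-style graph with placeholder tensors. The placeholder shapes and the empty input_dict are assumptions for illustration only; a full model normally supplies additional inputs (e.g. the type embedding) through input_dict.

from deepmd.tf.env import tf

ntypes = 2  # matching the constructor sketch above
t_coord  = tf.placeholder(tf.float64, [None, None], name="t_coord")   # (nframes, natoms * 3)
t_type   = tf.placeholder(tf.int32,   [None, None], name="t_type")    # (nframes, natoms)
t_natoms = tf.placeholder(tf.int32,   [ntypes + 2], name="t_natoms")  # see natoms above
t_box    = tf.placeholder(tf.float64, [None, 9],    name="t_box")     # (nframes, 9)
t_mesh   = tf.placeholder(tf.int32,   [None],       name="t_mesh")

dout = descrpt.build(
    t_coord, t_type, t_natoms, t_box, t_mesh,
    input_dict={},   # a full model usually adds the type embedding and other inputs here
    reuse=False,
    suffix="_example",
)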

_pass_filter(inputs, atype, natoms, input_dict, reuse=None, suffix='', trainable=True)[source]
_compute_dstats_sys_smth(data_coord, data_box, data_atype, natoms_vec, mesh, mixed_type=False, real_natoms_vec=None)[source]
_lookup_type_embedding(xyz_scatter, natype, type_embedding)[source]

Concatenate type_embedding of neighbors and xyz_scatter. If not self.type_one_side, concatenate type_embedding of center atoms as well.

Parameters:
xyz_scatter:

shape is [nframes*natoms[0]*self.nnei, 1]

natype:

neighbor atom type

type_embedding:

shape is [self.ntypes, Y] where Y = jdata['type_embedding']['neuron'][-1]

Returns:
embedding:

The environment of each atom, represented by the embedding.

_scaled_dot_attn(Q, K, V, temperature, input_r, dotr=False, do_mask=False, layer=0, save_weights=True)[source]
_attention_layers(input_xyz, layer_num, shape_i, outputs_size, input_r, dotr=False, do_mask=False, trainable=True, suffix='')[source]
_filter_lower(type_i, type_input, start_index, incrs_index, inputs, type_embedding=None, atype=None, is_exclude=False, activation_fn=None, bavg=0.0, stddev=1.0, trainable=True, suffix='', name='filter_', reuse=None)[source]

Input env matrix, returns R.G.

_filter(inputs, type_input, natoms, type_embedding=None, atype=None, activation_fn=tf.nn.tanh, stddev=1.0, bavg=0.0, suffix='', name='linear', reuse=None, trainable=True)[source]
init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None[source]

Initialize the embedding net variables from the given frozen graph.

Parameters:
graph: tf.Graph

The input frozen model graph

graph_def: tf.GraphDef

The input frozen model graph_def

suffix: str, optional

The suffix of the scope
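
A short hedged sketch: when retraining or fine-tuning from a frozen model, restore the embedding-net variables before rebuilding the graph. graph and graph_def are assumed to have been loaded as in the enable_compression sketch above.

# graph / graph_def: frozen model loaded earlier (see enable_compression sketch).
descrpt.init_variables(graph=graph, graph_def=graph_def, suffix="")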

build_type_exclude_mask_mixed(exclude_types: Set[Tuple[int, int]], ntypes: int, sel: List[int], ndescrpt: int, atype: deepmd.tf.env.tf.Tensor, shape0: deepmd.tf.env.tf.Tensor, nei_type_vec: deepmd.tf.env.tf.Tensor) deepmd.tf.env.tf.Tensor[source]

Build the type exclude mask for the attention descriptor.

Parameters:
exclude_types: List[Tuple[int, int]]

The list of excluded types, e.g. [(0, 1), (1, 0)] means the interaction between type 0 and type 1 is excluded.

ntypes: int

The number of types.

sel: List[int]

The list of the number of selected neighbors for each type.

ndescrpt: int

The number of descriptors for each atom.

atype: tf.Tensor

The type of atoms, with the size of shape0.

shape0: tf.Tensor

The shape of the first dimension of the inputs, which is equal to nsamples * natoms.

nei_type_vec: tf.Tensor

The type of neighbors, with the size of (shape0, nnei).

Returns:
tf.Tensor

The type exclude mask, with the shape of (shape0, ndescrpt), and the precision of GLOBAL_TF_FLOAT_PRECISION. The mask has the value of 1 if the interaction between two types is not excluded, and 0 otherwise.

Notes

This method builds the type exclude mask in a similar way to deepmd.tf.descriptor.descriptor.Descriptor.build_type_exclude_mask(). The mathematical expression is explained in that method. The difference is that the attention descriptor provides the types of the neighbors (idx_j), which are not in order, so they are taken from an extra input.
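
As a NumPy reference sketch of the logic described above (not the TF implementation), assuming ndescrpt is a multiple of nnei so that each neighbor contributes ndescrpt // nnei descriptor components:

import numpy as np

def type_exclude_mask_mixed_ref(exclude_types, atype, nei_type_vec, ndescrpt):
    # atype: (shape0,) center-atom types; nei_type_vec: (shape0, nnei) neighbor types.
    shape0, nnei = nei_type_vec.shape
    pair_mask = np.ones((shape0, nnei))
    for ii in range(shape0):
        for jj in range(nnei):
            # 0 if the (center, neighbor) type pair is excluded, 1 otherwise.
            if (int(atype[ii]), int(nei_type_vec[ii, jj])) in exclude_types:
                pair_mask[ii, jj] = 0.0
    # Expand each neighbor's mask over its descriptor components.
    return np.repeat(pair_mask, ndescrpt // nnei, axis=1)

# Example with the pair (0, 1) excluded in both directions, as in the docstring above.
mask = type_exclude_mask_mixed_ref(
    {(0, 1), (1, 0)},
    np.array([0, 1]),
    np.array([[1, 1, 0], [0, 0, 1]]),
    ndescrpt=12,
)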

classmethod update_sel(global_jdata: dict, local_jdata: dict)[source]

Update the selection and perform neighbor statistics.

Parameters:
global_jdata: dict

The global data, containing the training section

local_jdata: dict

The local data referring to the current class

serialize_attention_layers(nlayer: int, nnei: int, embed_dim: int, hidden_dim: int, dotr: bool, do_mask: bool, trainable_ln: bool, ln_eps: float, variables: dict, bias: bool = True, suffix: str = '') dict[source]
classmethod deserialize_attention_layers(data: dict, suffix: str = '') dict[source]

Deserialize attention layers.

Parameters:
data: dict

The input attention layer data

suffix: str, optional

The suffix of the scope

Returns:
variables: dict

The deserialized variables

classmethod deserialize(data: dict, suffix: str = '')[source]

Deserialize the model.

Parameters:
data: dict

The serialized data

Returns:
Model

The deserialized model

serialize(suffix: str = '') dict[source]

Serialize the model.

Parameters:
suffix: str, optional

The suffix of the scope

Returns:
dict

The serialized data
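
A hedged round-trip sketch of this serialize/deserialize pair, assuming the descriptor's variables have already been created and initialized in the current TensorFlow session:

data = descrpt.serialize(suffix="")                      # plain dict of the descriptor's state
restored = DescrptSeAtten.deserialize(data, suffix="")   # rebuild an equivalent descriptor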

class deepmd.tf.descriptor.se_atten.DescrptDPA1Compat(rcut: float, rcut_smth: float, sel: List[int] | int, ntypes: int, neuron: List[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = True, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: List[List[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = 'default', scaling_factor=1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 0.001, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, spin: Any | None = None, seed: int | None = None, uniform_seed: bool = False)[source]

Bases: DescrptSeAtten

Consistent version of the model for testing with other backend references.

This model includes the type_embedding as attributes and other additional parameters.

Parameters:
rcut: float

The cut-off radius \(r_c\)

rcut_smth: float

From where the environment matrix should be smoothed \(r_s\)

sel: list[int], int

list[int]: sel[i] specifies the maximum number of type i atoms within the cut-off radius. int: the total maximum number of atoms within the cut-off radius.

ntypes: int

Number of element types

neuron: list[int]

Number of neurons in each hidden layer of the embedding net \(\mathcal{N}\)

axis_neuron: int

Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)

tebd_dim: int

Dimension of the type embedding

tebd_input_mode: str

(Only 'concat' is supported, to keep consistent with other backend references.) The way to mix the type embeddings.

resnet_dt: bool

Time-step dt in the resnet construction: \(y = x + dt \cdot \phi(Wx + b)\)

trainable: bool

Whether the weights of this descriptor are trainable.

trainable_ln: bool

Whether to use trainable shift and scale weights in layer normalization.

ln_eps: float, Optional

The epsilon value for layer normalization.

type_one_side: bool

If False, type embeddings of both neighbor and central atoms are considered; if True, only type embeddings of neighbor atoms are considered.

attn: int

Hidden dimension of the attention vectors

attn_layer: int

Number of attention layers

attn_dotr: bool

Whether to dot the angular gate onto the attention weights.

attn_mask: bool

(Only False is supported, to keep consistent with other backend references.) Whether to mask the diagonal of the attention weights.

exclude_types: List[List[int]]

The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.

env_protection: float

Protection parameter to prevent division by zero errors during environment matrix calculations.

set_davg_zero: bool

Set the shift of embedding net input to zero.

activation_function: str

The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.

precision: str

The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.

scaling_factor: float

(Only kept for consistency with other backend references; not used in this version.) The scaling factor of normalization in the calculation of attention weights. If temperature is None, the scaling of the attention weights is (N_dim * scaling_factor)**0.5.

normalize: bool

(Only True is supported, to keep consistent with other backend references; not used in this version.) Whether to normalize the hidden vectors in the attention-weight calculation.

temperature: float

(Only 1.0 is supported, to keep consistent with other backend references; not used in this version.) If not None, the scaling of the attention weights is temperature itself.

smooth_type_embedding: bool

(Only False is supported, to keep consistent with other backend references.) Whether to use a smooth process in the attention-weight calculation.

concat_output_tebd: bool

Whether to concat type embedding at the output of the descriptor.

spin

(Only None is supported, to keep consistent with the old implementation.) The old implementation of deepspin.
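
A minimal construction sketch (values are illustrative placeholders; the constraints noted above are respected):

from deepmd.tf.descriptor.se_atten import DescrptDPA1Compat

compat = DescrptDPA1Compat(
    rcut=6.0,
    rcut_smth=0.5,
    sel=120,
    ntypes=2,
    tebd_dim=8,                # dimension of the built-in type embedding
    tebd_input_mode="concat",  # only "concat" is supported (see above)
    attn_mask=False,           # only False is supported (see above)
    spin=None,                 # only None is supported (see above)
)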

build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor[source]

Build the computational graph for the descriptor.

Parameters:
coord_

The coordinate of atoms

atype_

The type of atoms

natoms

The number of atoms. This tensor has the length Ntypes + 2: natoms[0] is the number of local atoms; natoms[1] is the total number of atoms held by this processor; natoms[i], 2 <= i < Ntypes + 2, is the number of type i atoms.

box_: tf.Tensor

The box of the system

mesh

For historical reasons, only the length of this Tensor matters: if the size of mesh is 6, PBC is assumed; if the size of mesh is 0, no PBC is assumed.

input_dict

Dictionary for additional inputs

reuse

Whether the weights in the networks should be reused when getting the variables.

suffix

Name suffix to identify this descriptor

Returns:
descriptor

The output descriptor

init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None[source]

Initialize the embedding net variables from the given frozen graph.

Parameters:
graph: tf.Graph

The input frozen model graph

graph_def: tf.GraphDef

The input frozen model graph_def

suffix: str, optional

The suffix of the scope

update_attention_layers_serialize(data: dict)[source]

Update the serialized data to be consistent with other backend references.

classmethod deserialize(data: dict, suffix: str = '')[source]

Deserialize the model.

Parameters:
data: dict

The serialized data

Returns:
Model

The deserialized model

serialize(suffix: str = '') dict[source]

Serialize the model.

Parameters:
suffix: str, optional

The suffix of the scope

Returns:
dict

The serialized data