deepmd.tf.descriptor.se_atten#
Attributes#
Classes#
- DescrptSeAtten: Smooth version descriptor with attention.
- DescrptDPA1Compat: Consistent version of the model for testing with other backend references.
Module Contents#
- class deepmd.tf.descriptor.se_atten.DescrptSeAtten(rcut: float, rcut_smth: float, sel: list[int] | int, ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, resnet_dt: bool = False, trainable: bool = True, seed: int | None = None, type_one_side: bool = True, set_davg_zero: bool = True, exclude_types: list[list[int]] = [], activation_function: str = 'tanh', precision: str = 'default', uniform_seed: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, smooth_type_embedding: bool = False, tebd_input_mode: str = 'concat', scaling_factor=1.0, normalize=True, temperature=None, trainable_ln: bool = True, ln_eps: float | None = 0.001, concat_output_tebd: bool = True, env_protection: float = 0.0, stripped_type_embedding: bool | None = None, type_map: list[str] | None = None, **kwargs)[source]#
Bases:
deepmd.tf.descriptor.se_a.DescrptSeA
Smooth version descriptor with attention.
- Parameters:
- rcut: float
The cut-off radius \(r_c\)
- rcut_smth: float
From where the environment matrix should be smoothed \(r_s\)
- sel: list[int], int
list[int]: sel[i] specifies the maximum number of type i atoms in the cut-off radius. int: the total maximum number of atoms in the cut-off radius.
- neuron: list[int]
Number of neurons in each hidden layers of the embedding net \(\mathcal{N}\)
- axis_neuron: int
Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)
- resnet_dt: bool
Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
- trainable: bool
If the weights of the embedding net are trainable.
- seed: int, Optional
Random seed for initializing the network parameters.
- type_one_side: bool
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘True’.
- exclude_types: list[list[int]]
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- set_davg_zero: bool
Set the shift of embedding net input to zero.
- activation_function: str
The activation function in the embedding net. Supported options are “none”, “gelu_tf”, “linear”, “relu6”, “sigmoid”, “tanh”, “gelu”, “relu”, “softplus”.
- precision: str
The precision of the embedding net parameters. Supported options are “float16”, “float64”, “default”, “float32”.
- uniform_seed: bool
Only for the purpose of backward compatibility; retrieves the old behavior of using the random seed.
- attn: int
The length of the hidden vector during the scaled dot-product attention computation.
- attn_layer: int
The number of layers in attention mechanism.
- attn_dotr: bool
Whether to dot the relative coordinates on the attention weights as a gated scheme.
- attn_mask: bool
Whether to mask the diagonal in the attention weights.
- ln_eps: float, Optional
The epsilon value for layer normalization.
- tebd_input_mode: str
The input mode of the type embedding. Supported modes are [“concat”, “strip”].
- “concat”: Concatenate the type embedding with the smoothed radial information as the union input for the embedding network.
- “strip”: Use a separate embedding network for the type embedding and combine the output with the radial embedding network output.
The default value will be “strip” in the se_atten_v2 descriptor.
- smooth_type_embedding: bool
Whether to use the smooth process in attention-weight calculation. When using stripped type embedding, this also controls whether the smooth factor is multiplied into the network output of the type embedding to keep the network smooth, instead of setting set_davg_zero to True. The default value will be True in the se_atten_v2 descriptor.
- stripped_type_embedding: bool, Optional
(Deprecated, kept only for compatibility.) Whether to strip the type embedding into a separate embedding network. Setting this parameter to True is equivalent to setting tebd_input_mode to ‘strip’. Setting it to False is equivalent to setting tebd_input_mode to ‘concat’. The default value is None, which means the tebd_input_mode setting will be used instead.
- type_map: list[str], Optional
A list of strings. Give the name to each type of atoms.
- Raises:
ValueError
if ntypes is 0.
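A minimal construction sketch follows, assuming deepmd-kit with the TensorFlow backend is installed; the numeric values are illustrative placeholders rather than recommended settings.

```python
# Minimal construction sketch; cutoffs, sel, and network sizes are placeholders.
from deepmd.tf.descriptor.se_atten import DescrptSeAtten

descrpt = DescrptSeAtten(
    rcut=6.0,              # cut-off radius r_c
    rcut_smth=0.5,         # smoothing starts at r_s
    sel=120,               # int form: total maximum number of neighbors
    ntypes=2,              # must be non-zero, otherwise ValueError is raised
    neuron=[25, 50, 100],  # embedding-net hidden layers
    axis_neuron=8,         # M_2, columns of the embedding sub-matrix
    attn=128,              # attention hidden dimension
    attn_layer=2,          # number of attention layers
    tebd_input_mode="concat",
)
```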
- compute_input_stats(data_coord: list, data_box: list, data_atype: list, natoms_vec: list, mesh: list, input_dict: dict, mixed_type: bool = False, real_natoms_vec: list | None = None, **kwargs) None [source]#
Compute the statistics (avg and std) of the training data. The input will be normalized by the statistics.
- Parameters:
- data_coord
The coordinates. Can be generated by deepmd.tf.model.make_stat_input
- data_box
The box. Can be generated by deepmd.tf.model.make_stat_input
- data_atype
The atom types. Can be generated by deepmd.tf.model.make_stat_input
- natoms_vec
The vector for the number of atoms of the system and different types of atoms. If mixed_type is True, this parameter is not used; see real_natoms_vec.
- mesh
The mesh for neighbor searching. Can be generated by deepmd.tf.model.make_stat_input
- input_dict
Dictionary for additional input
- mixed_type
Whether to perform the mixed_type mode. If True, the input data has the mixed_type format (see doc/model/train_se_atten.md), in which frames in a system may have different natoms_vec(s), with the same nloc.
- real_natoms_vec
If mixed_type is True, it takes in the real natoms_vec for each frame.
- **kwargs
Additional keyword arguments.
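The data arguments above are normally produced by the training pipeline (e.g. deepmd.tf.model.make_stat_input). As a rough, self-contained illustration of what the resulting statistics are used for, the toy NumPy snippet below mimics the average/standard-deviation normalization; the shapes and the guard value are assumptions for illustration only, not the actual implementation.

```python
import numpy as np

# Toy illustration (not the actual implementation): per-type average and
# standard deviation of the environment-matrix rows are used to normalize
# the descriptor input.
env_rows = np.random.rand(1000, 4)   # fake environment-matrix rows, 4 columns per neighbor
davg = env_rows.mean(axis=0)         # forced to zero when set_davg_zero=True
dstd = env_rows.std(axis=0) + 1e-2   # small guard against zero std (illustrative value)
normalized = (env_rows - davg) / dstd
```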
- enable_compression(min_nbor_dist: float, graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, table_extrapolate: float = 5, table_stride_1: float = 0.01, table_stride_2: float = 0.1, check_frequency: int = -1, suffix: str = '', tebd_suffix: str = '') None [source]#
Receive the statistics (distance, max_nbor_size and env_mat_range) of the training data.
- Parameters:
- min_nbor_dist
The nearest distance between atoms
- graph: tf.Graph
The graph of the model
- graph_def: tf.GraphDef
The graph_def of the model
- table_extrapolate
The scale of model extrapolation
- table_stride_1
The uniform stride of the first table
- table_stride_2
The uniform stride of the second table
- check_frequency
The overflow check frequency
- suffix: str, optional
The suffix of the scope
- tebd_suffix: str, optional
The suffix of the type embedding scope, only for DescrptDPA1Compat
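A hedged call sketch follows, continuing from the construction sketch above; graph and graph_def are assumed to come from a frozen, trained model (how they are loaded is outside the scope of this snippet), and the table settings simply repeat the documented defaults.

```python
# Sketch only: `graph` and `graph_def` must come from a frozen, trained model;
# calling this on an untrained descriptor will not work.
descrpt.enable_compression(
    min_nbor_dist=0.8,     # nearest inter-atomic distance in the training data
    graph=graph,           # tf.Graph of the trained model (assumed available)
    graph_def=graph_def,   # tf.GraphDef of the trained model (assumed available)
    table_extrapolate=5,
    table_stride_1=0.01,
    table_stride_2=0.1,
    check_frequency=-1,
)
```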
- build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor [source]#
Build the computational graph for the descriptor.
- Parameters:
- coord_
The coordinate of atoms
- atype_
The type of atoms
- natoms
The number of atoms. This tensor has the length of Ntypes + 2. natoms[0]: number of local atoms. natoms[1]: total number of atoms held by this processor. natoms[i] (2 <= i < Ntypes+2): number of type i atoms.
- box_: tf.Tensor
The box of the system
- mesh
For historical reasons, only the length of the Tensor matters. If the size of mesh is 6, pbc is assumed; if the size of mesh is 0, no-pbc is assumed.
- input_dict
Dictionary for additional inputs
- reuse
Whether the weights in the networks should be reused when getting the variables.
- suffix
Name suffix to identify this descriptor
- Returns:
descriptor
The output descriptor
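For concreteness, the natoms layout described above would look like this for a hypothetical two-type system (the numbers are made up for illustration):

```python
# Hypothetical natoms layout for ntypes = 2 (length ntypes + 2):
#   natoms[0]     -> number of local atoms
#   natoms[1]     -> total number of atoms held by this processor
#   natoms[2 + i] -> number of type-i atoms
natoms = [64, 200, 40, 24]   # 40 type-0 atoms + 24 type-1 atoms = 64 local atoms
```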
- _compute_dstats_sys_smth(data_coord, data_box, data_atype, natoms_vec, mesh, mixed_type=False, real_natoms_vec=None)[source]#
- _lookup_type_embedding(xyz_scatter, natype, type_embedding)[source]#
Concatenate type_embedding of neighbors and xyz_scatter. If not self.type_one_side, concatenate type_embedding of center atoms as well.
- Parameters:
- xyz_scatter:
shape is [nframes*natoms[0]*self.nnei, 1]
- natype:
neighbor atom type
- type_embedding:
shape is [self.ntypes, Y] where Y=jdata[‘type_embedding’][‘neuron’][-1]
- Returns:
- embedding:
environment of each atom represented by embedding.
- _scaled_dot_attn(Q, K, V, temperature, input_r, dotr=False, do_mask=False, layer=0, save_weights=True)[source]#
- _attention_layers(input_xyz, layer_num, shape_i, outputs_size, input_r, dotr=False, do_mask=False, trainable=True, suffix='')[source]#
- _filter_lower(type_i, type_input, start_index, incrs_index, inputs, type_embedding=None, atype=None, is_exclude=False, activation_fn=None, bavg=0.0, stddev=1.0, trainable=True, suffix='', name='filter_', reuse=None)[source]#
Input env matrix, returns R.G.
- _filter(inputs, type_input, natoms, type_embedding=None, atype=None, activation_fn=tf.nn.tanh, stddev=1.0, bavg=0.0, suffix='', name='linear', reuse=None, trainable=True)[source]#
- init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None [source]#
Init the embedding net variables with the given dict.
- build_type_exclude_mask_mixed(exclude_types: set[tuple[int, int]], ntypes: int, sel: list[int], ndescrpt: int, atype: deepmd.tf.env.tf.Tensor, shape0: deepmd.tf.env.tf.Tensor, nei_type_vec: deepmd.tf.env.tf.Tensor) deepmd.tf.env.tf.Tensor [source]#
Build the type exclude mask for the attention descriptor.
- Parameters:
- exclude_types: list[tuple[int, int]]
The list of excluded types, e.g. [(0, 1), (1, 0)] means the interaction between type 0 and type 1 is excluded.
- ntypes: int
The number of types.
- sel: list[int]
The list of the number of selected neighbors for each type.
- ndescrpt: int
The number of descriptors for each atom.
- atype: tf.Tensor
The type of atoms, with the size of shape0.
- shape0: tf.Tensor
The shape of the first dimension of the inputs, which is equal to nsamples * natoms.
- nei_type_vec: tf.Tensor
The type of neighbors, with the size of (shape0, nnei).
- Returns:
tf.Tensor
The type exclude mask, with the shape of (shape0, ndescrpt), and the precision of GLOBAL_TF_FLOAT_PRECISION. The mask has the value of 1 if the interaction between two types is not excluded, and 0 otherwise.
Notes
This method builds the type exclude mask in a similar way to deepmd.tf.descriptor.descriptor.Descriptor.build_type_exclude_mask(). The mathematical expression is explained in that method. The difference is that the attention descriptor provides the types of the neighbors (idx_j), which are not in order, so they are taken from an extra input.
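As a toy illustration of the mask semantics (pure NumPy, not the TF implementation; in the real method each neighbor entry is additionally broadcast over its descriptor columns so the result has shape (shape0, ndescrpt)):

```python
import numpy as np

# Toy semantics: mask is 1 unless the (central type, neighbor type) pair is excluded.
exclude_types = {(0, 1), (1, 0)}
atype = np.array([0, 1])                # central-atom types, shape (shape0,)
nei_type_vec = np.array([[0, 1, 1],     # neighbor types, shape (shape0, nnei)
                         [1, 0, 0]])
mask = np.array([[0.0 if (ti, tj) in exclude_types else 1.0 for tj in row]
                 for ti, row in zip(atype, nei_type_vec)])
# mask -> [[1., 0., 0.],
#          [1., 0., 0.]]
```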
- classmethod update_sel(train_data: deepmd.utils.data_system.DeepmdDataSystem, type_map: list[str] | None, local_jdata: dict) tuple[dict, float | None] [source]#
Update the selection and perform neighbor statistics.
- serialize_attention_layers(nlayer: int, nnei: int, embed_dim: int, hidden_dim: int, dotr: bool, do_mask: bool, trainable_ln: bool, ln_eps: float, variables: dict, bias: bool = True, suffix: str = '') dict [source]#
- serialize_network_strip(ntypes: int, ndim: int, in_dim: int, neuron: list[int], activation_function: str, resnet_dt: bool, variables: dict, suffix: str = '', type_one_side: bool = False) dict [source]#
Serialize network.
- Parameters:
- ntypes: int
The number of types
- ndim: int
The dimension of elements
- in_dim: int
The input dimension
- neuron: list[int]
The neuron list
- activation_function: str
The activation function
- resnet_dt: bool
Whether to use resnet
- variables: dict
The input variables
- suffix: str, optional
The suffix of the scope
- type_one_side: bool, optional
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- Returns:
dict
The converted network data
- classmethod deserialize_attention_layers(data: dict, suffix: str = '') dict [source]#
Deserialize attention layers.
- classmethod deserialize_network_strip(data: dict, suffix: str = '', type_one_side: bool = False) dict [source]#
Deserialize network.
- Parameters:
- Returns:
- variables: dict
The input variables
- classmethod deserialize(data: dict, suffix: str = '')[source]#
Deserialize the model.
- Parameters:
- data: dict
The serialized data
- Returns:
Model
The deserialized model
- class deepmd.tf.descriptor.se_atten.DescrptDPA1Compat(rcut: float, rcut_smth: float, sel: list[int] | int, ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = True, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: list[list[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = 'default', scaling_factor=1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 0.001, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, use_econf_tebd: bool = False, use_tebd_bias: bool = False, type_map: list[str] | None = None, spin: Any | None = None, seed: int | None = None, uniform_seed: bool = False)[source]#
Bases:
DescrptSeAtten
Consistent version of the model for testing with other backend references.
This model includes the type_embedding as attributes and other additional parameters.
- Parameters:
- rcut: float
The cut-off radius \(r_c\)
- rcut_smth: float
From where the environment matrix should be smoothed \(r_s\)
- sel: list[int], int
list[int]: sel[i] specifies the maximum number of type i atoms in the cut-off radius. int: the total maximum number of atoms in the cut-off radius.
- ntypes: int
Number of element types
- neuron: list[int]
Number of neurons in each hidden layers of the embedding net \(\mathcal{N}\)
- axis_neuron: int
Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)
- tebd_dim: int
Dimension of the type embedding
- tebd_input_mode: str
The input mode of the type embedding. Supported modes are [“concat”, “strip”].
- “concat”: Concatenate the type embedding with the smoothed radial information as the union input for the embedding network.
- “strip”: Use a separate embedding network for the type embedding and combine the output with the radial embedding network output.
- resnet_dt: bool
Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
- trainable: bool
If the weights of this descriptor are trainable.
- trainable_ln: bool
Whether to use trainable shift and scale weights in layer normalization.
- ln_eps: float, Optional
The epsilon value for layer normalization.
- type_one_side: bool
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘True’.
- attn: int
Hidden dimension of the attention vectors
- attn_layer: int
Number of attention layers
- attn_dotr: bool
Whether to dot the angular gate onto the attention weights.
- attn_mask: bool
(Only support False to keep consistent with other backend references.) Whether to mask the diagonal of the attention weights.
- exclude_types: list[list[int]]
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection: float
Protection parameter to prevent division by zero errors during environment matrix calculations.
- set_davg_zero: bool
Set the shift of embedding net input to zero.
- activation_function: str
The activation function in the embedding net. Supported options are “none”, “gelu_tf”, “linear”, “relu6”, “sigmoid”, “tanh”, “gelu”, “relu”, “softplus”.
- precision: str
The precision of the embedding net parameters. Supported options are “float16”, “float64”, “default”, “float32”.
- scaling_factor: float
(Only to keep consistent with other backend references.) (Not used in this version.) The scaling factor of normalization in calculations of attention weights. If temperature is None, the scaling of attention weights is (N_dim * scaling_factor)**0.5
- normalize: bool
(Only support True to keep consistent with other backend references.) (Not used in this version.) Whether to normalize the hidden vectors in attention weights calculation.
- temperature: float
(Only support 1.0 to keep consistent with other backend references.) (Not used in this version.) If not None, the scaling of attention weights is temperature itself.
- smooth_type_embedding: bool
(Only support False to keep consistent with other backend references.) Whether to use smooth process in attention weights calculation.
- concat_output_tebd: bool
Whether to concat type embedding at the output of the descriptor.
- use_econf_tebd: bool, Optional
Whether to use electronic configuration type embedding.
- use_tebd_bias: bool, Optional
Whether to use bias in the type embedding layer.
- type_map: list[str], Optional
A list of strings. Give the name to each type of atoms.
- spin
(Only support None to keep consistent with old implementation.) The old implementation of deepspin.
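A minimal construction sketch, mirroring the one for DescrptSeAtten above; the values are illustrative placeholders and the element names in type_map are hypothetical.

```python
# Minimal construction sketch; values are illustrative placeholders.
from deepmd.tf.descriptor.se_atten import DescrptDPA1Compat

compat = DescrptDPA1Compat(
    rcut=6.0,
    rcut_smth=0.5,
    sel=120,
    ntypes=2,
    tebd_dim=8,                # type-embedding dimension
    tebd_input_mode="concat",
    attn=128,
    attn_layer=2,
    type_map=["O", "H"],       # hypothetical element names for the two types
)
```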
- build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor [source]#
Build the computational graph for the descriptor.
- Parameters:
- coord_
The coordinate of atoms
- atype_
The type of atoms
- natoms
The number of atoms. This tensor has the length of Ntypes + 2. natoms[0]: number of local atoms. natoms[1]: total number of atoms held by this processor. natoms[i] (2 <= i < Ntypes+2): number of type i atoms.
- box_: tf.Tensor
The box of the system
- mesh
For historical reasons, only the length of the Tensor matters. If the size of mesh is 6, pbc is assumed; if the size of mesh is 0, no-pbc is assumed.
- input_dict
Dictionary for additional inputs
- reuse
Whether the weights in the networks should be reused when getting the variables.
- suffix
Name suffix to identify this descriptor
- Returns:
descriptor
The output descriptor
- enable_compression(min_nbor_dist: float, graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, table_extrapolate: float = 5, table_stride_1: float = 0.01, table_stride_2: float = 0.1, check_frequency: int = -1, suffix: str = '', tebd_suffix: str = '') None [source]#
Receive the statistics (distance, max_nbor_size and env_mat_range) of the training data.
- Parameters:
- min_nbor_dist
The nearest distance between atoms
- graph: tf.Graph
The graph of the model
- graph_def: tf.GraphDef
The graph_def of the model
- table_extrapolate
The scale of model extrapolation
- table_stride_1
The uniform stride of the first table
- table_stride_2
The uniform stride of the second table
- check_frequency
The overflow check frequency
- suffix: str, optional
The suffix of the scope
- tebd_suffix: str, optional
Same as suffix.
- init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None [source]#
Init the embedding net variables with the given dict.
- update_attention_layers_serialize(data: dict)[source]#
Update the serialized data to be consistent with other backend references.