deepmd.tf.descriptor.se_atten

Module Contents

Classes

DescrptSeAtten

Smooth version of the descriptor with attention.

DescrptDPA1Compat

Consistent version of the model for testing with other backend references.

Attributes

log

deepmd.tf.descriptor.se_atten.log[source]
class deepmd.tf.descriptor.se_atten.DescrptSeAtten(rcut: float, rcut_smth: float, sel: List[int] | int, ntypes: int, neuron: List[int] = [25, 50, 100], axis_neuron: int = 8, resnet_dt: bool = False, trainable: bool = True, seed: int | None = None, type_one_side: bool = True, set_davg_zero: bool = True, exclude_types: List[List[int]] = [], activation_function: str = 'tanh', precision: str = 'default', uniform_seed: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, multi_task: bool = False, stripped_type_embedding: bool = False, smooth_type_embedding: bool = False, scaling_factor=1.0, normalize=True, temperature=None, trainable_ln: bool = True, ln_eps: float | None = 0.001, concat_output_tebd: bool = True, env_protection: float = 0.0, **kwargs)[source]

Bases: deepmd.tf.descriptor.se_a.DescrptSeA

Smooth version of the descriptor with attention.

Parameters:
rcut: float

The cut-off radius \(r_c\)

rcut_smth: float

From where the environment matrix should be smoothed \(r_s\)

sel: list[int], int

list[int]: sel[i] specifies the maximum number of type i atoms within the cut-off radius. int: the total maximum number of atoms within the cut-off radius.

neuron: list[int]

Number of neurons in each hidden layer of the embedding net \(\mathcal{N}\)

axis_neuron: int

Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)

resnet_dt: bool

Time-step dt in the resnet construction: \(y = x + dt \cdot \phi(Wx + b)\)

trainable: bool

Whether the weights of the embedding net are trainable.

seed: int, Optional

Random seed for initializing the network parameters.

type_one_side: bool

If True, build N_types embedding nets; otherwise, build N_types^2 embedding nets.

exclude_types: List[List[int]]

The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.

set_davg_zero: bool

Set the shift of embedding net input to zero.

activation_function: str

The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.

precision: str

The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.

uniform_seed: bool

Only for backward compatibility; restores the old behavior of using the random seed.

attn: int

The length of the hidden vector in the scaled-dot-product attention computation.

attn_layer: int

The number of layers in the attention mechanism.

attn_dotr: bool

Whether to dot the relative coordinates into the attention weights as a gating scheme.

attn_mask: bool

Whether to mask the diagonal in the attention weights.

ln_eps: float, Optional

The epsilon value for layer normalization.

multi_task: bool

Whether the model has multiple fitting nets to train.

stripped_type_embedding: bool

Whether to strip the type embedding into a separate embedding network. The default is True in the se_atten_v2 descriptor.

smooth_type_embedding: bool

Whether to use a smooth process in the attention-weight calculation. When stripped type embedding is used, this also controls whether the smooth factor is dotted onto the output of the type-embedding network to keep the descriptor smooth, instead of setting set_davg_zero to True. The default is True in the se_atten_v2 descriptor.

Raises:
ValueError

if ntypes is 0.
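
A minimal construction sketch (not part of the original documentation) follows; all numerical values are illustrative placeholders, not recommended settings.

from deepmd.tf.descriptor.se_atten import DescrptSeAtten

# Hypothetical two-element system; the values are placeholders.
descrpt = DescrptSeAtten(
    rcut=6.0,              # cut-off radius r_c
    rcut_smth=0.5,         # smoothing starts at r_s
    sel=120,               # total maximum number of neighbors (int form of sel)
    ntypes=2,              # must be non-zero, otherwise ValueError is raised
    neuron=[25, 50, 100],  # embedding-net hidden layers
    axis_neuron=8,         # number of axis neurons M_2
    attn=128,              # hidden dimension of the attention vectors
    attn_layer=2,          # number of attention layers
)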

property explicit_ntypes: bool[source]

Explicit ntypes with type embedding.

compute_input_stats(data_coord: list, data_box: list, data_atype: list, natoms_vec: list, mesh: list, input_dict: dict, mixed_type: bool = False, real_natoms_vec: list | None = None, **kwargs) None[source]

Compute the statistics (avg and std) of the training data. The input will be normalized by the statistics.

Parameters:
data_coord

The coordinates. Can be generated by deepmd.tf.model.make_stat_input

data_box

The box. Can be generated by deepmd.tf.model.make_stat_input

data_atype

The atom types. Can be generated by deepmd.tf.model.make_stat_input

natoms_vec

The vector for the number of atoms of the system and the different types of atoms. If mixed_type is True, this parameter is left blank; see real_natoms_vec.

mesh

The mesh for neighbor searching. Can be generated by deepmd.tf.model.make_stat_input

input_dict

Dictionary for additional input

mixed_type

Whether to run in mixed_type mode. If True, the input data has the mixed_type format (see doc/model/train_se_atten.md), in which frames within a system may have different natoms_vec(s) while sharing the same nloc.

real_natoms_vec

If mixed_type is True, it takes in the real natoms_vec for each frame.

**kwargs

Additional keyword arguments.
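
A schematic call is sketched below; it is an assumption-laden illustration rather than a runnable script, and the statistics inputs are assumed to have been produced by deepmd.tf.model.make_stat_input as described above (their construction is omitted).

descrpt.compute_input_stats(
    data_coord=data_coord,    # list of coordinate arrays, one entry per system
    data_box=data_box,        # list of box arrays
    data_atype=data_atype,    # list of atom-type arrays
    natoms_vec=natoms_vec,    # per-system natoms vectors
    mesh=mesh,                # neighbor-searching mesh
    input_dict={},            # no additional input in this sketch
    mixed_type=False,         # set True (with real_natoms_vec) for mixed-type data
)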

enable_compression(min_nbor_dist: float, graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, table_extrapolate: float = 5, table_stride_1: float = 0.01, table_stride_2: float = 0.1, check_frequency: int = -1, suffix: str = '') None[source]

Receive the statistics (distance, max_nbor_size and env_mat_range) of the training data.

Parameters:
min_nbor_dist

The nearest distance between atoms

graph: tf.Graph

The graph of the model

graph_def: tf.GraphDef

The graph_def of the model

table_extrapolate

The scale of model extrapolation

table_stride_1

The uniform stride of the first table

table_stride_2

The uniform stride of the second table

check_frequency

The overflow check frequency

suffix: str, optional

The suffix of the scope
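
A hedged sketch of a typical call follows. It loads a frozen graph with plain TensorFlow calls (assuming deepmd.tf.env.tf exposes the TF1-style API used in the annotations above) and then enables compression on the descriptor constructed earlier; the file name is a placeholder.

from deepmd.tf.env import tf

# Load a frozen model; "frozen_model.pb" is a placeholder path.
graph_def = tf.GraphDef()
with open("frozen_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

descrpt.enable_compression(
    min_nbor_dist=0.5,     # nearest inter-atomic distance in the training data (illustrative)
    graph=graph,
    graph_def=graph_def,
    table_extrapolate=5,
    table_stride_1=0.01,
    table_stride_2=0.1,
)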

build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor[source]

Build the computational graph for the descriptor.

Parameters:
coord_

The coordinate of atoms

atype_

The type of atoms

natoms

The number of atoms. This tensor has the length Ntypes + 2: natoms[0] is the number of local atoms; natoms[1] is the total number of atoms held by this processor; natoms[i], 2 <= i < Ntypes + 2, is the number of type i atoms.

box_: tf.Tensor

The box of the system

mesh

For historical reasons, only the length of this Tensor matters: if the size of mesh is 6, PBC is assumed; if the size of mesh is 0, no PBC is assumed.

input_dict

Dictionary for additional inputs

reuse

Whether the weights in the networks should be reused when getting the variables.

suffix

Name suffix to identify this descriptor

Returns:
descriptor

The output descriptor
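
The sketch below wires the descriptor into a TF1-style graph with placeholder tensors. The placeholder shapes and the empty input_dict are assumptions for illustration only; a full model normally supplies additional inputs (e.g. the type embedding) through input_dict.

from deepmd.tf.env import tf

ntypes = 2  # matching the constructor sketch above
t_coord  = tf.placeholder(tf.float64, [None, None], name="t_coord")   # (nframes, natoms * 3)
t_type   = tf.placeholder(tf.int32,   [None, None], name="t_type")    # (nframes, natoms)
t_natoms = tf.placeholder(tf.int32,   [ntypes + 2], name="t_natoms")  # see natoms above
t_box    = tf.placeholder(tf.float64, [None, 9],    name="t_box")     # (nframes, 9)
t_mesh   = tf.placeholder(tf.int32,   [None],       name="t_mesh")

dout = descrpt.build(
    t_coord, t_type, t_natoms, t_box, t_mesh,
    input_dict={},   # a full model usually adds the type embedding and other inputs here
    reuse=False,
    suffix="_example",
)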

_pass_filter(inputs, atype, natoms, input_dict, reuse=None, suffix='', trainable=True)[source]
_compute_dstats_sys_smth(data_coord, data_box, data_atype, natoms_vec, mesh, mixed_type=False, real_natoms_vec=None)[source]
_lookup_type_embedding(xyz_scatter, natype, type_embedding)[source]

Concatenate type_embedding of neighbors and xyz_scatter. If not self.type_one_side, concatenate type_embedding of center atoms as well.

Parameters:
xyz_scatter:

shape is [nframes*natoms[0]*self.nnei, 1]

natype:

neighbor atom type

type_embedding:

shape is [self.ntypes, Y] where Y = jdata['type_embedding']['neuron'][-1]

Returns:
embedding:

The environment of each atom, represented by the embedding.

_scaled_dot_attn(Q, K, V, temperature, input_r, dotr=False, do_mask=False, layer=0, save_weights=True)[source]
_attention_layers(input_xyz, layer_num, shape_i, outputs_size, input_r, dotr=False, do_mask=False, trainable=True, suffix='')[source]
_filter_lower(type_i, type_input, start_index, incrs_index, inputs, type_embedding=None, atype=None, is_exclude=False, activation_fn=None, bavg=0.0, stddev=1.0, trainable=True, suffix='', name='filter_', reuse=None)[source]

Input env matrix, returns R.G.

_filter(inputs, type_input, natoms, type_embedding=None, atype=None, activation_fn=tf.nn.tanh, stddev=1.0, bavg=0.0, suffix='', name='linear', reuse=None, trainable=True)[source]
init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None[source]

Initialize the embedding net variables from the given frozen graph.

Parameters:
graph: tf.Graph

The input frozen model graph

graph_def: tf.GraphDef

The input frozen model graph_def

suffix: str, optional

The suffix of the scope
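
A short hedged sketch: when retraining or fine-tuning from a frozen model, restore the embedding-net variables before rebuilding the graph. graph and graph_def are assumed to have been loaded as in the enable_compression sketch above.

# graph / graph_def: frozen model loaded earlier (see enable_compression sketch).
descrpt.init_variables(graph=graph, graph_def=graph_def, suffix="")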

build_type_exclude_mask_mixed(exclude_types: Set[Tuple[int, int]], ntypes: int, sel: List[int], ndescrpt: int, atype: deepmd.tf.env.tf.Tensor, shape0: deepmd.tf.env.tf.Tensor, nei_type_vec: deepmd.tf.env.tf.Tensor) deepmd.tf.env.tf.Tensor[source]

Build the type exclude mask for the attention descriptor.

Parameters:
exclude_types: List[Tuple[int, int]]

The list of excluded types, e.g. [(0, 1), (1, 0)] means the interaction between type 0 and type 1 is excluded.

ntypes: int

The number of types.

sel: List[int]

The list of the number of selected neighbors for each type.

ndescrpt: int

The number of descriptors for each atom.

atype: tf.Tensor

The type of atoms, with the size of shape0.

shape0: tf.Tensor

The shape of the first dimension of the inputs, which is equal to nsamples * natoms.

nei_type_vec: tf.Tensor

The type of neighbors, with the size of (shape0, nnei).

Returns:
tf.Tensor

The type exclude mask, with the shape of (shape0, ndescrpt), and the precision of GLOBAL_TF_FLOAT_PRECISION. The mask has the value of 1 if the interaction between two types is not excluded, and 0 otherwise.

Notes

This method builds the type exclude mask in a similar way to deepmd.tf.descriptor.descriptor.Descriptor.build_type_exclude_mask(). The mathematical expression is explained in that method. The difference is that the attention descriptor provides the types of the neighbors (idx_j), which are not in order, so they are taken from an extra input.
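
As a NumPy reference sketch of the logic described above (not the TF implementation), assuming ndescrpt is a multiple of nnei so that each neighbor contributes ndescrpt // nnei descriptor components:

import numpy as np

def type_exclude_mask_mixed_ref(exclude_types, atype, nei_type_vec, ndescrpt):
    # atype: (shape0,) center-atom types; nei_type_vec: (shape0, nnei) neighbor types.
    shape0, nnei = nei_type_vec.shape
    pair_mask = np.ones((shape0, nnei))
    for ii in range(shape0):
        for jj in range(nnei):
            # 0 if the (center, neighbor) type pair is excluded, 1 otherwise.
            if (int(atype[ii]), int(nei_type_vec[ii, jj])) in exclude_types:
                pair_mask[ii, jj] = 0.0
    # Expand each neighbor's mask over its descriptor components.
    return np.repeat(pair_mask, ndescrpt // nnei, axis=1)

# Example with the pair (0, 1) excluded in both directions, as in the docstring above.
mask = type_exclude_mask_mixed_ref(
    {(0, 1), (1, 0)},
    np.array([0, 1]),
    np.array([[1, 1, 0], [0, 0, 1]]),
    ndescrpt=12,
)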

classmethod update_sel(global_jdata: dict, local_jdata: dict)[source]

Update the selection and perform neighbor statistics.

Parameters:
global_jdata: dict

The global data, containing the training section

local_jdata: dict

The local data referring to the current class

serialize_attention_layers(nlayer: int, nnei: int, embed_dim: int, hidden_dim: int, dotr: bool, do_mask: bool, trainable_ln: bool, ln_eps: float, variables: dict, bias: bool = True, suffix: str = '') dict[source]
classmethod deserialize_attention_layers(data: dict, suffix: str = '') dict[source]

Deserialize attention layers.

Parameters:
data: dict

The input attention layer data

suffix: str, optional

The suffix of the scope

Returns:
variables: dict

The deserialized variables

classmethod deserialize(data: dict, suffix: str = '')[source]

Deserialize the model.

Parameters:
data: dict

The serialized data

Returns:
Model

The deserialized model

serialize(suffix: str = '') dict[source]

Serialize the model.

Parameters:
suffix: str, optional

The suffix of the scope

Returns:
dict

The serialized data
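
A hedged round-trip sketch of this serialize/deserialize pair, assuming the descriptor's variables have already been created and initialized in the current TensorFlow session:

data = descrpt.serialize(suffix="")                      # plain dict of the descriptor's state
restored = DescrptSeAtten.deserialize(data, suffix="")   # rebuild an equivalent descriptor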

class deepmd.tf.descriptor.se_atten.DescrptDPA1Compat(rcut: float, rcut_smth: float, sel: List[int] | int, ntypes: int, neuron: List[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = True, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: List[List[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = 'default', scaling_factor=1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 0.001, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, spin: Any | None = None, seed: int | None = None, uniform_seed: bool = False)[source]

Bases: DescrptSeAtten

Consistent version of the model for testing with other backend references.

This model includes the type_embedding as attributes and other additional parameters.

Parameters:
rcut: float

The cut-off radius \(r_c\)

rcut_smth: float

From where the environment matrix should be smoothed \(r_s\)

sel: list[int], int

list[int]: sel[i] specifies the maximum number of type i atoms within the cut-off radius. int: the total maximum number of atoms within the cut-off radius.

ntypes: int

Number of element types

neuron: list[int]

Number of neurons in each hidden layer of the embedding net \(\mathcal{N}\)

axis_neuron: int

Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)

tebd_dim: int

Dimension of the type embedding

tebd_input_mode: str

(Only 'concat' is supported, to keep consistent with other backend references.) The way to mix the type embeddings.

resnet_dt: bool

Time-step dt in the resnet construction: \(y = x + dt \cdot \phi(Wx + b)\)

trainable: bool

Whether the weights of this descriptor are trainable.

trainable_ln: bool

Whether to use trainable shift and scale weights in layer normalization.

ln_eps: float, Optional

The epsilon value for layer normalization.

type_one_side: bool

If False, type embeddings of both neighbor and central atoms are considered; if True, only type embeddings of neighbor atoms are considered.

attn: int

Hidden dimension of the attention vectors

attn_layer: int

Number of attention layers

attn_dotr: bool

Whether to dot the angular gate onto the attention weights.

attn_mask: bool

(Only False is supported, to keep consistent with other backend references.) Whether to mask the diagonal of the attention weights.

exclude_types: List[List[int]]

The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.

env_protection: float

Protection parameter to prevent division by zero errors during environment matrix calculations.

set_davg_zero: bool

Set the shift of embedding net input to zero.

activation_function: str

The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.

precision: str

The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.

scaling_factor: float

(Only kept for consistency with other backend references; not used in this version.) The scaling factor of normalization in the calculation of attention weights. If temperature is None, the scaling of the attention weights is (N_dim * scaling_factor)**0.5.

normalize: bool

(Only True is supported, to keep consistent with other backend references; not used in this version.) Whether to normalize the hidden vectors in the attention-weight calculation.

temperature: float

(Only 1.0 is supported, to keep consistent with other backend references; not used in this version.) If not None, the scaling of the attention weights is temperature itself.

smooth_type_embedding: bool

(Only False is supported, to keep consistent with other backend references.) Whether to use a smooth process in the attention-weight calculation.

concat_output_tebd: bool

Whether to concat type embedding at the output of the descriptor.

spin

(Only None is supported, to keep consistent with the old implementation.) The old implementation of deepspin.
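
A minimal construction sketch (values are illustrative placeholders; the constraints noted above are respected):

from deepmd.tf.descriptor.se_atten import DescrptDPA1Compat

compat = DescrptDPA1Compat(
    rcut=6.0,
    rcut_smth=0.5,
    sel=120,
    ntypes=2,
    tebd_dim=8,                # dimension of the built-in type embedding
    tebd_input_mode="concat",  # only "concat" is supported (see above)
    attn_mask=False,           # only False is supported (see above)
    spin=None,                 # only None is supported (see above)
)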

build(coord_: deepmd.tf.env.tf.Tensor, atype_: deepmd.tf.env.tf.Tensor, natoms: deepmd.tf.env.tf.Tensor, box_: deepmd.tf.env.tf.Tensor, mesh: deepmd.tf.env.tf.Tensor, input_dict: dict, reuse: bool | None = None, suffix: str = '') deepmd.tf.env.tf.Tensor[source]

Build the computational graph for the descriptor.

Parameters:
coord_

The coordinate of atoms

atype_

The type of atoms

natoms

The number of atoms. This tensor has the length Ntypes + 2: natoms[0] is the number of local atoms; natoms[1] is the total number of atoms held by this processor; natoms[i], 2 <= i < Ntypes + 2, is the number of type i atoms.

box_: tf.Tensor

The box of the system

mesh

For historical reasons, only the length of this Tensor matters: if the size of mesh is 6, PBC is assumed; if the size of mesh is 0, no PBC is assumed.

input_dict

Dictionary for additional inputs

reuse

Whether the weights in the networks should be reused when getting the variables.

suffix

Name suffix to identify this descriptor

Returns:
descriptor

The output descriptor

init_variables(graph: deepmd.tf.env.tf.Graph, graph_def: deepmd.tf.env.tf.GraphDef, suffix: str = '') None[source]

Initialize the embedding net variables from the given frozen graph.

Parameters:
graph: tf.Graph

The input frozen model graph

graph_def: tf.GraphDef

The input frozen model graph_def

suffix: str, optional

The suffix of the scope

update_attention_layers_serialize(data: dict)[source]

Update the serialized data to be consistent with other backend references.

classmethod deserialize(data: dict, suffix: str = '')[source]

Deserialize the model.

Parameters:
data: dict

The serialized data

Returns:
Model

The deserialized model

serialize(suffix: str = '') dict[source]

Serialize the model.

Parameters:
suffix: str, optional

The suffix of the scope

Returns:
dict

The serialized data