5.4. Training Parameters#

Note

One can load, modify, and export the input file using the web-based tool DP-GUI, which is available online or can be hosted locally via the command line interface dp gui. All training parameters below can be set in DP-GUI. Clicking “SAVE JSON” downloads the input file for further training.

Note

One can benefit from IntelliSense and validation when writing JSON files in Visual Studio Code. See here to learn how to configure it.

model:#

type: dict
argument path: model
type_map:#
type: list[str], optional
argument path: model/type_map
A list of strings giving the name of each atom type. Note that the number of atom types in a training system must be less than 128 in a GPU environment. If not given, type.raw in each system should use the same type indexes, and type_map.raw will have no effect.
data_stat_nbatch:#
type: int, optional, default: 10
argument path: model/data_stat_nbatch
The model determines the normalization from the statistics of the data. This key specifies the number of frames in each system used for statistics.
data_stat_protect:#
type: float, optional, default: 0.01
argument path: model/data_stat_protect
Protect parameter for atomic energy regression.
data_bias_nsample:#
type: int, optional, default: 10
argument path: model/data_bias_nsample
The number of training samples in a system to compute and change the energy bias.
use_srtab:#
type: str, optional
argument path: model/use_srtab
The table for the short-range pairwise interaction added on top of DP. The table is a text data file with (N_t + 1) * N_t / 2 + 1 columns. The first column is the distance between atoms. The second to the last columns are energies for pairs of certain types. For example, with two atom types, 0 and 1, the 2nd to 4th columns are for the 0-0, 0-1, and 1-1 pairs, respectively.
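The column layout of the table can be enumerated programmatically. The following is a minimal sketch; the helper name srtab_columns is hypothetical and not part of DeePMD-kit, and it assumes that pairs for more than two types follow the same ordering as the two-type example (0-0, 0-1, 1-1).

```python
from itertools import combinations_with_replacement


def srtab_columns(n_types):
    """Return column labels of a pair table for n_types atom types.

    Assumption: the distance comes first, followed by the pairs (i, j)
    with i <= j in lexicographic order, as in the two-type example.
    """
    pairs = combinations_with_replacement(range(n_types), 2)
    return ["r"] + [f"{i}-{j}" for i, j in pairs]


columns = srtab_columns(2)
# The total count matches (N_t + 1) * N_t / 2 + 1.
assert len(columns) == (2 + 1) * 2 // 2 + 1
```

Such a helper can be used to check that a table file has the expected number of columns before training.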
smin_alpha:#
type: float, optional
argument path: model/smin_alpha
The short-range tabulated interaction will be switched according to the distance of the nearest neighbor. This distance is calculated by softmin. This parameter is the decaying parameter in the softmin. It is only required when use_srtab is provided.
sw_rmin:#
type: float, optional
argument path: model/sw_rmin
The lower boundary of the interpolation between short-range tabulated interaction and DP. It is only required when use_srtab is provided.
sw_rmax:#
type: float, optional
argument path: model/sw_rmax
The upper boundary of the interpolation between short-range tabulated interaction and DP. It is only required when use_srtab is provided.
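As an illustration of how the keys related to use_srtab fit together, the sketch below builds a model fragment in Python and serializes it to JSON. The file name and all numerical values are hypothetical placeholders, not recommendations.

```python
import json

# Hypothetical values for illustration only; choose them for your own system.
model_fragment = {
    "use_srtab": "pair_table.txt",  # hypothetical table file name
    "smin_alpha": 0.1,              # decaying parameter of the softmin
    "sw_rmin": 0.8,                 # lower boundary of the interpolation
    "sw_rmax": 1.0,                 # upper boundary of the interpolation
}

# Sanity check: the lower boundary should lie below the upper one.
assert model_fragment["sw_rmin"] < model_fragment["sw_rmax"]

model_json = json.dumps({"model": model_fragment}, indent=2)
```

The three switching keys are only needed because use_srtab is present; without it they can be omitted.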
pair_exclude_types:#
type: list, optional, default: []
argument path: model/pair_exclude_types
(Supported Backend: PyTorch) The atom pairs of the listed types are not treated as neighbors, i.e. they do not see each other.
atom_exclude_types:#
type: list, optional, default: []
argument path: model/atom_exclude_types
(Supported Backend: PyTorch) Exclude the atomic contribution of the listed atom types
preset_out_bias:#
type: NoneType | dict[str, list[float | list[float] | None]], optional, default: None
argument path: model/preset_out_bias
(Supported Backend: PyTorch) The preset bias of the atomic output. Note that set_davg_zero should be set to true. The bias is provided as a dict. Taking an energy model with three atom types as an example, preset_out_bias may be given as { ‘energy’: [null, 0., 1.] }. In this case the energy biases of types 1 and 2 are set to 0. and 1., respectively. A dipole model with two atom types may set preset_out_bias as { ‘dipole’: [null, [0., 1., 2.]] }.
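When the input file is generated from Python, the null entries of preset_out_bias correspond to None. A minimal sketch of the two examples above, using only the standard json module:

```python
import json

# Energy model with three atom types: type 0 is left unset (null),
# types 1 and 2 are preset to 0.0 and 1.0.
energy_bias = {"energy": [None, 0.0, 1.0]}

# Dipole model with two atom types: only type 1 is preset.
dipole_bias = {"dipole": [None, [0.0, 1.0, 2.0]]}

# json.dumps writes Python None as JSON null.
energy_json = json.dumps({"preset_out_bias": energy_bias})
dipole_json = json.dumps({"preset_out_bias": dipole_bias})
```

Note that JSON itself requires double-quoted keys, which json.dumps produces automatically.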
srtab_add_bias:#
type: bool, optional, default: True
argument path: model/srtab_add_bias
(Supported Backend: TensorFlow) Whether to add the energy bias from the statistics of the data to the short-range tabulated atomic energy. It only takes effect when use_srtab is provided.
type_embedding:#
type: dict, optional
argument path: model/type_embedding
(Supported Backend: TensorFlow) The type embedding. In other backends, the type embedding is already included in the descriptor.
neuron:#
type: list[int], optional, default: [8]
argument path: model/type_embedding/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model/type_embedding/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model/type_embedding/resnet_dt
Whether to use a “Timestep” in the skip connection
precision:#
type: str, optional, default: default
argument path: model/type_embedding/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model/type_embedding/trainable
If the parameters in the embedding net are trainable
seed:#
type: NoneType | int, optional, default: None
argument path: model/type_embedding/seed
Random seed for parameter initialization
use_econf_tebd:#
type: bool, optional, default: False
argument path: model/type_embedding/use_econf_tebd
Whether to use electronic configuration type embedding.
use_tebd_bias:#
type: bool, optional, default: False
argument path: model/type_embedding/use_tebd_bias
Whether to use bias in the type embedding layer.
modifier:#
type: dict, optional
argument path: model/modifier
(Supported Backend: TensorFlow) The modifier of model output.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key)
argument path: model/modifier/type
possible choices: dipole_charge
The type of modifier.
dipole_charge: Use WFCC to model the electronic structure of the system. Correct the long-range interaction.
When type is set to dipole_charge:
Use WFCC to model the electronic structure of the system. Correct the long-range interaction.
model_name:#
type: str
argument path: model/modifier[dipole_charge]/model_name
The name of the frozen dipole model file.
model_charge_map:#
type: list[float]
argument path: model/modifier[dipole_charge]/model_charge_map
The charge of the WFCC. The list length should be the same as the sel_type.
sys_charge_map:#
type: list[float]
argument path: model/modifier[dipole_charge]/sys_charge_map
The charge of real atoms. The list length should be the same as the type_map
ewald_beta:#
type: float, optional, default: 0.4
argument path: model/modifier[dipole_charge]/ewald_beta
The splitting parameter of Ewald sum. Unit is A^-1
ewald_h:#
type: float, optional, default: 1.0
argument path: model/modifier[dipole_charge]/ewald_h
The grid spacing of the FFT grid. Unit is A
compress:#
type: dict, optional
argument path: model/compress
(Supported Backend: TensorFlow) Model compression configurations
spin:#
type: dict, optional
argument path: model/spin
The settings for systems with spin.
use_spin:#
type: list[bool] | list[int]
argument path: model/spin/use_spin
Whether to use atomic spin model for each atom type. List of boolean values with the shape of [ntypes] to specify which types use spin, or a list of integer values (Supported Backend: PyTorch) to indicate the index of the type that uses spin.
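The two accepted forms of use_spin carry the same information. The sketch below converts either form to a list of type indices; the helper name spin_type_indices is hypothetical and only illustrates the correspondence.

```python
def spin_type_indices(use_spin):
    """Normalize use_spin to a sorted list of type indices that use spin.

    Accepts either a list of booleans of length ntypes or a list of
    integer type indices.
    """
    # bool is a subclass of int in Python, so test for bool explicitly.
    if all(isinstance(flag, bool) for flag in use_spin):
        return [i for i, flag in enumerate(use_spin) if flag]
    return sorted(use_spin)


# Both forms select types 0 and 2 of a three-type system.
from_flags = spin_type_indices([True, False, True])
from_indices = spin_type_indices([2, 0])
```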
spin_norm:#
type: list[float], optional
argument path: model/spin/spin_norm
(Supported Backend: TensorFlow) The magnitude of atomic spin for each atom type with spin
virtual_len:#
type: list[float], optional
argument path: model/spin/virtual_len
(Supported Backend: TensorFlow) The distance between virtual atom representing spin and its corresponding real atom for each atom type with spin
virtual_scale:#
type: float | list[float], optional
argument path: model/spin/virtual_scale
(Supported Backend: PyTorch) The scaling factor to determine the virtual distance between a virtual atom representing spin and its corresponding real atom for each atom type with spin. This factor is defined as the virtual distance divided by the magnitude of atomic spin for each atom type with spin. The virtual coordinate is defined as the real coordinate plus spin * virtual_scale. List of float values with shape of [ntypes] or [ntypes_spin] or one single float value for all types, only used when use_spin is True for each atom type.
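The definition of the virtual coordinate can be written out directly. A minimal sketch for a single atom, with hypothetical numbers and a single float virtual_scale:

```python
def virtual_coordinate(real_coord, spin, virtual_scale):
    """Virtual atom position: real coordinate plus spin * virtual_scale."""
    return [r + s * virtual_scale for r, s in zip(real_coord, spin)]


# Hypothetical atom at (1, 2, 3) with spin (0, 0, 2) and virtual_scale 0.5.
virtual = virtual_coordinate([1.0, 2.0, 3.0], [0.0, 0.0, 2.0], 0.5)
```

With these numbers the virtual atom is displaced by 1.0 along the third axis, which is the magnitude of the spin times virtual_scale.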
finetune_head:#
type: str, optional
argument path: model/finetune_head
(Supported Backend: PyTorch) The chosen fitting net to fine-tune on, when doing multi-task fine-tuning. If not set or set to ‘RANDOM’, the fitting net will be randomly initialized.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key), default: standard
argument path: model/type
possible choices: standard, frozen, pairtab, pairwise_dprc, linear_ener
standard: Standard model, which contains a descriptor and a fitting.
pairtab: (Supported Backend: TensorFlow) Pairwise tabulation energy model.
pairwise_dprc: (Supported Backend: TensorFlow)
linear_ener: (Supported Backend: TensorFlow)
When type is set to standard:
Standard model, which contains a descriptor and a fitting.
descriptor:#
type: dict
argument path: model[standard]/descriptor
The descriptor of atomic environment.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key)
argument path: model[standard]/descriptor/type
possible choices: loc_frame, se_e2_a, se_e3, se_a_tpe, se_e2_r, hybrid, se_atten, se_e3_tebd, se_atten_v2, dpa2, dpa3, se_a_ebd_v2, se_a_mask
The type of the descriptor.
loc_frame: (Supported Backend: TensorFlow) Defines a local frame at each atom, and then computes the descriptor as local coordinates under this frame.
se_e2_a: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
se_e3: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
se_a_tpe: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
se_e2_r: Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
hybrid: Concatenates a list of descriptors into a new descriptor.
se_atten: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism will be used by this descriptor.
se_e3_tebd: (Supported Backend: PyTorch)
se_atten_v2: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism with new modifications will be used by this descriptor.
dpa2: (Supported Backend: PyTorch)
dpa3: (Supported Backend: PyTorch)
se_a_ebd_v2: (Supported Backend: TensorFlow)
se_a_mask: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (Non-PBC system). aparam are required as an indicator matrix for the real/virtual sign of input atoms.
When type is set to loc_frame:
(Supported Backend: TensorFlow) Defines a local frame at each atom, and then computes the descriptor as local coordinates under this frame.
sel_a:#
type: list[int]
argument path: model[standard]/descriptor[loc_frame]/sel_a
A list of integers. The length of the list should be the same as the number of atom types in the system. sel_a[i] gives the selected number of type-i neighbors. The full relative coordinates of the neighbors are used by the descriptor.
sel_r:#
type: list[int]
argument path: model[standard]/descriptor[loc_frame]/sel_r
A list of integers. The length of the list should be the same as the number of atom types in the system. sel_r[i] gives the selected number of type-i neighbors. Only the relative distances of the neighbors are used by the descriptor. sel_a[i] + sel_r[i] is recommended to be larger than the maximum possible number of type-i neighbors within the cut-off radius.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[loc_frame]/rcut
The cut-off radius. The default value is 6.0
axis_rule:#
type: list[int]
argument path: model[standard]/descriptor[loc_frame]/axis_rule
A list of integers. The length should be 6 times the number of types.
axis_rule[i*6+0]: class of the atom defining the first axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+1]: type of the atom defining the first axis of type-i atom.
axis_rule[i*6+2]: index of the axis atom defining the first axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
axis_rule[i*6+3]: class of the atom defining the second axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+4]: type of the atom defining the second axis of type-i atom.
axis_rule[i*6+5]: index of the axis atom defining the second axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
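The flat layout of axis_rule can be unpacked into one record per atom type. The following is a sketch only; decode_axis_rule is a hypothetical helper and the example rule is made up for a two-type system.

```python
def decode_axis_rule(axis_rule, n_types):
    """Split a flat axis_rule list into per-type axis definitions."""
    if len(axis_rule) != 6 * n_types:
        raise ValueError("axis_rule must contain 6 integers per atom type")
    decoded = []
    for i in range(n_types):
        # Entries i*6+0 .. i*6+5 belong to type-i atoms.
        c1, t1, k1, c2, t2, k2 = axis_rule[i * 6 : i * 6 + 6]
        decoded.append(
            {
                "first_axis": {"class": c1, "type": t1, "index": k1},
                "second_axis": {"class": c2, "type": t2, "index": k2},
            }
        )
    return decoded


# Hypothetical rule for a two-type system.
rules = decode_axis_rule([0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0], n_types=2)
```

Here rules[0] describes the two axes of type-0 atoms and rules[1] those of type-1 atoms.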
When type is set to se_e2_a (or its alias se_a):
Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_e2_a]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximum possible number of type-i neighbors within the cut-off radius. Note that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximum number of neighbors within the cut-off radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
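The rule behind the “auto” option can be sketched as follows. This is an illustration of the description above, not the library's implementation: auto_sel is a hypothetical helper, the per-type maximum neighbor counts are assumed to be known already, and floating-point edge cases are ignored.

```python
import math


def auto_sel(max_neighbors, spec="auto"):
    """Scale per-type maximum neighbor counts and round up to a multiple of 4."""
    factor = 1.1  # "auto" is equivalent to "auto:1.1"
    if spec != "auto":
        prefix, _, value = spec.partition(":")
        if prefix != "auto" or not value:
            raise ValueError(f"unsupported sel specification: {spec!r}")
        factor = float(value)
    return [4 * math.ceil(n * factor / 4) for n in max_neighbors]


# Hypothetical maxima of 46 and 92 neighbors for two atom types.
sel = auto_sel([46, 92])
```

For these hypothetical counts, 46 * 1.1 = 50.6 is rounded up to 52 and 92 * 1.1 = 101.2 is rounded up to 104.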
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_e2_a]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_e2_a]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
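To make the role of rcut_smth concrete, the sketch below smooths 1/r to zero between rcut_smth and rcut with a fifth-order polynomial switch. The polynomial form is an assumption made for this illustration; consult the descriptor documentation for the exact function used by the library.

```python
def smoothed_inverse_distance(r, rcut_smth=0.5, rcut=6.0):
    """Return 1/r below rcut_smth, a smoothly decaying value up to rcut, else 0.

    Assumes r > 0. The quintic switch below is an illustrative choice.
    """
    if r >= rcut:
        return 0.0
    if r < rcut_smth:
        return 1.0 / r
    # u runs from 0 at rcut_smth to 1 at rcut.
    u = (r - rcut_smth) / (rcut - rcut_smth)
    switch = u**3 * (-6.0 * u**2 + 15.0 * u - 10.0) + 1.0
    return switch / r
```

The switch equals 1 at rcut_smth and 0 at rcut, so the function is continuous at both ends of the smoothing window.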
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_e2_a]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
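The skip-connection rule can be checked for a given neuron list with a few lines of Python. In this sketch the helper name and the input_dim parameter are hypothetical; treating the network input as the "previous layer" of the first hidden layer is an assumption.

```python
def skip_connection_layers(neuron, input_dim):
    """Flag each hidden layer that would get a skip connection.

    A layer qualifies when its width equals, or is twice, the width
    of the layer before it.
    """
    flags = []
    previous = input_dim
    for width in neuron:
        flags.append(width == previous or width == 2 * previous)
        previous = width
    return flags


# Default neuron list with an assumed scalar input.
flags = skip_connection_layers([10, 20, 40], input_dim=1)
```

With the default [10, 20, 40], the second and third layers each double the previous width and therefore qualify, while the first layer does not.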
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_e2_a]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_e2_a]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_a]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_a]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
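The effect of type_one_side on the number of parameter sets can be stated in one line. A small sketch with a hypothetical helper name:

```python
def n_embedding_nets(n_types, type_one_side):
    """Number of embedding network parameter sets for n_types atom types."""
    # One set per neighbor type, or one per (central type, neighbor type) pair.
    return n_types if type_one_side else n_types**2
```

For a three-type system this gives 3 sets with type_one_side enabled and 9 sets otherwise.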
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_e2_a]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e2_a]/trainable
If the parameters in the embedding net are trainable
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_e2_a]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_e2_a]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
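One way to picture exclude_types is as a symmetric boolean mask over type pairs. The sketch below is illustrative only; build_pair_mask is a hypothetical helper, and the symmetric treatment follows from the wording "between type 0 and type 1".

```python
def build_pair_mask(n_types, exclude_types):
    """Return mask[i][j] == True when types i and j interact."""
    mask = [[True] * n_types for _ in range(n_types)]
    for ti, tj in exclude_types:
        # Exclusion applies in both directions.
        mask[ti][tj] = False
        mask[tj][ti] = False
    return mask


# [[0, 1]] removes the interaction between type 0 and type 1.
mask = build_pair_mask(2, [[0, 1]])
```

Pairs of the same type (0-0 and 1-1) remain interacting in this example.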
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_e2_a]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_a]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
When type is set to se_e3 (or its aliases se_at, se_a_3be, se_t):
Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_e3]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximum possible number of type-i neighbors within the cut-off radius. Note that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximum number of neighbors within the cut-off radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_e3]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_e3]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_e3]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_e3]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e3]/resnet_dt
Whether to use a “Timestep” in the skip connection
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_e3]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3]/trainable
If the parameters in the embedding net are trainable
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_e3]/seed
Random seed for parameter initialization
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e3]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_e3]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_e3]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
When type is set to se_a_tpe (or its alias se_a_ebd):
(Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_a_tpe]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximum possible number of type-i neighbors within the cut-off radius. Note that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximum number of neighbors within the cut-off radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_a_tpe]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_a_tpe]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_a_tpe]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_a_tpe]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_a_tpe]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_tpe]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_tpe]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_a_tpe]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_a_tpe]/trainable
If the parameters in the embedding net are trainable
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_a_tpe]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_a_tpe]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_a_tpe]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_tpe]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
type_nchanl:#
type: int, optional, default: 4
argument path: model[standard]/descriptor[se_a_tpe]/type_nchanl
Number of channels for type embedding
type_nlayer:#
type: int, optional, default: 2
argument path: model[standard]/descriptor[se_a_tpe]/type_nlayer
Number of hidden layers of the type embedding net
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/descriptor[se_a_tpe]/numb_aparam
Dimension of the atomic parameter. If set to a value > 0, the atomic parameters are embedded.
When type is set to se_e2_r (or its alias se_r):
Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_e2_r]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximum possible number of type-i neighbors within the cut-off radius. Note that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximum number of neighbors within the cut-off radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_e2_r]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_e2_r]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_e2_r]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_e2_r]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_r]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_r]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_e2_r]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e2_r]/trainable
If the parameters in the embedding net are trainable
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_e2_r]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_e2_r]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_r]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_e2_r]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
When type is set to hybrid:
Concatenates a list of descriptors into a new descriptor.
list:#
type: list
argument path: model[standard]/descriptor[hybrid]/list
A list of descriptor definitions
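A hybrid descriptor nests ordinary descriptor definitions inside its list key. The sketch below assembles one from Python using only keys documented in this section; the particular combination of se_e2_a and se_e2_r and all values are illustrative choices, not recommendations.

```python
import json

# Each entry of "list" is a complete descriptor definition of its own.
descriptor = {
    "type": "hybrid",
    "list": [
        {
            "type": "se_e2_a",
            "sel": "auto",
            "rcut": 6.0,
            "rcut_smth": 0.5,
            "neuron": [10, 20, 40],
            "axis_neuron": 4,
        },
        {
            "type": "se_e2_r",
            "sel": "auto",
            "rcut": 6.0,
            "rcut_smth": 0.5,
            "neuron": [10, 20, 40],
        },
    ],
}

descriptor_json = json.dumps({"model": {"descriptor": descriptor}}, indent=2)
```

The serialized fragment can then be merged into the model section of the input file.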
When type is set to se_atten (or its alias dpa1):
Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism will be used by this descriptor.
sel:#
type: list[int] | str | int, optional, default: auto
argument path: model[standard]/descriptor[se_atten]/sel
This parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: instead of treating each type of atom separately, only the sum matters. This number strongly affects efficiency, so it should not be made too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
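The arithmetic behind the “auto” option can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and max_neighbors stands in for the neighbor count that would be measured from the training data.

```python
import math


def parse_auto_sel(spec):
    """Return the factor encoded in 'auto' or 'auto:factor'."""
    if spec == "auto":
        return 1.1  # 'auto' is equivalent to 'auto:1.1'
    prefix, _, factor = spec.partition(":")
    if prefix != "auto" or not factor:
        raise ValueError(f"not an auto spec: {spec!r}")
    return float(factor)


def auto_sel(max_neighbors, spec="auto"):
    """Scale the observed maximum and round up to a multiple of 4."""
    scaled = max_neighbors * parse_auto_sel(spec)
    return 4 * math.ceil(scaled / 4)


print(auto_sel(100))              # 112
print(auto_sel(40, "auto:1.5"))   # 60
```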
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_atten]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_atten]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
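One way to picture the smoothing is a switch that leaves 1/r untouched below rcut_smth and brings it to zero at rcut. The fifth-order polynomial used below is an assumption for illustration; this section does not give the exact functional form.

```python
def smoothed_inverse_r(r, rcut=6.0, rcut_smth=0.5):
    """Return 1/r, smoothly switched off between rcut_smth and rcut."""
    if r >= rcut:
        return 0.0
    if r < rcut_smth:
        return 1.0 / r
    # Assumed polynomial switch: equals 1 at u = 0 and 0 at u = 1.
    u = (r - rcut_smth) / (rcut - rcut_smth)
    switch = u**3 * (-6.0 * u**2 + 15.0 * u - 10.0) + 1.0
    return switch / r


print(smoothed_inverse_r(0.25))  # 4.0
print(smoothed_inverse_r(6.0))   # 0.0
```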
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_atten]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
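The skip-connection rule can be checked for a given neuron list with a small helper. This is a sketch only: the function name and the labels "identity" and "doubled" are hypothetical, and the transition from the network input to the first hidden layer is not covered.

```python
def skip_connections(neuron):
    """Classify each transition between consecutive hidden layers."""
    kinds = []
    for prev, curr in zip(neuron, neuron[1:]):
        if curr == prev:
            kinds.append("identity")  # same size
        elif curr == 2 * prev:
            kinds.append("doubled")   # twice as large as the previous layer
        else:
            kinds.append(None)        # no skip connection
    return kinds


print(skip_connections([10, 20, 40]))  # ['doubled', 'doubled']
print(skip_connections([25, 25, 30]))  # ['identity', None]
```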
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_atten]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_atten]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten]/type_one_side
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_atten]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_atten]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_atten]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_atten]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without protection.
attn:#
type: int, optional, default: 128
argument path: model[standard]/descriptor[se_atten]/attn
The length of hidden vectors in attention layers
attn_layer:#
type: int, optional, default: 2
argument path: model[standard]/descriptor[se_atten]/attn_layer
The number of attention layers. Note that model compression of se_atten works for any attn_layer value when tebd_input_mode==’strip’ (PyTorch backend only; for other backends, attn_layer=0 is still required for compression). When attn_layer!=0, only the type embedding is compressed; the geometric parts are not.
attn_dotr:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/attn_dotr
Whether to do dot product with the normalized relative coordinates
attn_mask:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten]/attn_mask
Whether to do mask on the diagonal in the attention matrix
stripped_type_embedding:#
type: bool | NoneType, optional, default: None
argument path: model[standard]/descriptor[se_atten]/stripped_type_embedding
(Deprecated, kept only for compatibility.) Whether to strip the type embedding into a separate embedding network. Setting this parameter to True is equivalent to setting tebd_input_mode to ‘strip’. Setting it to False is equivalent to setting tebd_input_mode to ‘concat’. The default value is None, which means the tebd_input_mode setting will be used instead.
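The precedence between the deprecated flag and tebd_input_mode, as described above, can be sketched like this. The function name is hypothetical and the snippet only mirrors the stated rule.

```python
def resolve_tebd_input_mode(stripped_type_embedding=None, tebd_input_mode="concat"):
    """Return the effective input mode of the type embedding."""
    if stripped_type_embedding is None:
        # The deprecated flag is unset, so tebd_input_mode is used as given.
        return tebd_input_mode
    return "strip" if stripped_type_embedding else "concat"


print(resolve_tebd_input_mode())                               # concat
print(resolve_tebd_input_mode(True))                           # strip
print(resolve_tebd_input_mode(None, tebd_input_mode="strip"))  # strip
```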
smooth_type_embedding:#
type: bool, optional, default: False, alias: smooth_type_embdding
argument path: model[standard]/descriptor[se_atten]/smooth_type_embedding
Whether to use a smooth process in the attention weights calculation. (Supported Backend: TensorFlow) When using stripped type embedding, whether to multiply the network output of the type embedding by the smooth factor to keep the network smooth, instead of setting set_davg_zero to True.
set_davg_zero:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/set_davg_zero
Set the normalization average to zero. This option should be set when se_atten descriptor or atom_ener in the energy fitting is used
trainable_ln:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/trainable_ln
Whether to use trainable shift and scale weights in layer normalization.
ln_eps:#
type: float | NoneType, optional, default: None
argument path: model[standard]/descriptor[se_atten]/ln_eps
The epsilon value for layer normalization. The default value for TensorFlow is 1e-3, for consistency with Keras, while it is 1e-5 in the PyTorch and DP implementations.
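The backend-dependent default can be summarized in a few lines. The function name and the backend strings below are hypothetical labels chosen for this sketch.

```python
def resolve_ln_eps(ln_eps=None, backend="pytorch"):
    """Return the epsilon used by layer normalization."""
    if ln_eps is not None:
        return ln_eps  # an explicit value always wins
    # Stated defaults: 1e-3 for TensorFlow, 1e-5 for PyTorch and DP.
    return 1e-3 if backend == "tensorflow" else 1e-5


print(resolve_ln_eps(backend="tensorflow"))  # 0.001
print(resolve_ln_eps(backend="pytorch"))     # 1e-05
```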
tebd_dim:#
type: int, optional, default: 8
argument path: model[standard]/descriptor[se_atten]/tebd_dim
(Supported Backend: PyTorch) The dimension of atom type embedding.
use_econf_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten]/use_econf_tebd
(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
use_tebd_bias:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten]/use_tebd_bias
Whether to use bias in the type embedding layer.
tebd_input_mode:#
type: str, optional, default: concat
argument path: model[standard]/descriptor[se_atten]/tebd_input_mode
The input mode of the type embedding. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the union input for the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine the output with the radial embedding network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
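The two modes differ in what is fed to the network and in how the outputs are combined. The sketch below uses plain Python lists; the helper names are hypothetical, r_ij is treated as a single scalar for simplicity, and the network outputs are passed in as ready-made vectors rather than computed.

```python
def concat_input(r_ij, tebd_j, tebd_i, type_one_side=False):
    """Input of the single embedding network in 'concat' mode."""
    return [r_ij] + tebd_j + ([] if type_one_side else tebd_i)


def strip_input(tebd_j, tebd_i, type_one_side=False):
    """Input of the type embedding network in 'strip' mode."""
    return tebd_j + ([] if type_one_side else tebd_i)


def strip_output(type_net_out, radial_net_out):
    """Combine both network outputs elementwise as t * s + s."""
    return [t * s + s for t, s in zip(type_net_out, radial_net_out)]


tebd_i, tebd_j = [0.1, 0.2], [0.3, 0.4]
print(concat_input(0.5, tebd_j, tebd_i))        # [0.5, 0.3, 0.4, 0.1, 0.2]
print(concat_input(0.5, tebd_j, tebd_i, True))  # [0.5, 0.3, 0.4]
print(strip_input(tebd_j, tebd_i))              # [0.3, 0.4, 0.1, 0.2]
print(strip_output([1.0, 0.0], [2.0, 3.0]))     # [4.0, 3.0]
```

Note that in 'strip' mode a type-network output of zero leaves the radial output unchanged, since t * s + s reduces to s.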
scaling_factor:#
type: float, optional, default: 1.0
argument path: model[standard]/descriptor[se_atten]/scaling_factor
(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Otherwise, the scaling of attention weights is set to temperature.
normalize:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/normalize
(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
temperature:#
type: float, optional
argument path: model[standard]/descriptor[se_atten]/temperature
(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
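The interplay of scaling_factor and temperature can be sketched as a single function. The name attention_scale is hypothetical, and how the resulting value is applied to matmul(Q, K) is not specified here, so the sketch only computes the value.

```python
def attention_scale(hidden_dim, scaling_factor=1.0, temperature=None):
    """Return the scaling applied to the attention weights."""
    if temperature is not None:
        return temperature  # an explicit temperature overrides the formula
    return (hidden_dim * scaling_factor) ** 0.5


print(attention_scale(64, scaling_factor=4.0))  # 16.0
print(attention_scale(64, temperature=2.0))     # 2.0
```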
concat_output_tebd:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten]/concat_output_tebd
(Supported Backend: PyTorch) Whether to concat type embedding at the output of the descriptor.
When type is set to se_e3_tebd:
(Supported Backend: PyTorch)
sel:#
type: list[int] | str | int, optional, default: auto
argument path: model[standard]/descriptor[se_e3_tebd]/sel
This parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: instead of treating each type of atom separately, only the sum matters. This number strongly affects efficiency, so it should not be made too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_e3_tebd]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_e3_tebd]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_e3_tebd]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
tebd_dim:#
type: int, optional, default: 8
argument path: model[standard]/descriptor[se_e3_tebd]/tebd_dim
(Supported Backend: PyTorch) The dimension of atom type embedding.
tebd_input_mode:#
type: str, optional, default: concat
argument path: model[standard]/descriptor[se_e3_tebd]/tebd_input_mode
The input mode of the type embedding. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed angular information as the union input for the embedding network. The input is input_jk = concat([angle_jk, tebd_j, tebd_k]). The output is out_jk = embedding(input_jk) for the three-body representation of atom i with neighbors j and k.
‘strip’: Use a separate embedding network for the type embedding and combine the output with the angular embedding network output. The input is input_t = concat([tebd_j, tebd_k]). The output is out_jk = embedding_t(input_t) * embedding_s(angle_jk) + embedding_s(angle_jk) for the three-body representation of atom i with neighbors j and k.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e3_tebd]/resnet_dt
Whether to use a “Timestep” in the skip connection
set_davg_zero:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3_tebd]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_e3_tebd]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_e3_tebd]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without protection.
smooth:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3_tebd]/smooth
Whether to use a smooth process in the calculation when using stripped type embedding, that is, whether to multiply the network output of the type embedding (out_jk) by the smooth factor (for both neighbors j and k) to keep the network smooth, instead of setting set_davg_zero to True.
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_e3_tebd]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_e3_tebd]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3_tebd]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_e3_tebd]/seed
Random seed for parameter initialization
concat_output_tebd:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3_tebd]/concat_output_tebd
(Supported Backend: PyTorch) Whether to concat type embedding at the output of the descriptor.
use_econf_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_e3_tebd]/use_econf_tebd
(Supported Backend: PyTorch) Whether to use electronic configuration type embedding.
use_tebd_bias:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_e3_tebd]/use_tebd_bias
Whether to use bias in the type embedding layer.
When type is set to se_atten_v2:
Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism with new modifications will be used by this descriptor.
sel:#
type: list[int] | str | int, optional, default: auto
argument path: model[standard]/descriptor[se_atten_v2]/sel
This parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: instead of treating each type of atom separately, only the sum matters. This number strongly affects efficiency, so it should not be made too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_atten_v2]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_atten_v2]/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_atten_v2]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_atten_v2]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_atten_v2]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/type_one_side
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_atten_v2]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten_v2]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_atten_v2]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_atten_v2]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_atten_v2]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without protection.
attn:#
type: int, optional, default: 128
argument path: model[standard]/descriptor[se_atten_v2]/attn
The length of hidden vectors in attention layers
attn_layer:#
type: int, optional, default: 2
argument path: model[standard]/descriptor[se_atten_v2]/attn_layer
The number of attention layers. Note that model compression of se_atten works for any attn_layer value when tebd_input_mode==’strip’ (PyTorch backend only; for other backends, attn_layer=0 is still required for compression). When attn_layer!=0, only the type embedding is compressed; the geometric parts are not.
attn_dotr:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten_v2]/attn_dotr
Whether to do dot product with the normalized relative coordinates
attn_mask:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/attn_mask
Whether to do mask on the diagonal in the attention matrix
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/set_davg_zero
Set the normalization average to zero. This option should be set when se_atten descriptor or atom_ener in the energy fitting is used
trainable_ln:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten_v2]/trainable_ln
Whether to use trainable shift and scale weights in layer normalization.
ln_eps:#
type: float | NoneType, optional, default: None
argument path: model[standard]/descriptor[se_atten_v2]/ln_eps
The epsilon value for layer normalization. The default value for TensorFlow is 1e-3, for consistency with Keras, while it is 1e-5 in the PyTorch and DP implementations.
tebd_dim:#
type: int, optional, default: 8
argument path: model[standard]/descriptor[se_atten_v2]/tebd_dim
(Supported Backend: PyTorch) The dimension of atom type embedding.
use_econf_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/use_econf_tebd
(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
use_tebd_bias:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_atten_v2]/use_tebd_bias
Whether to use bias in the type embedding layer.
scaling_factor:#
type: float, optional, default: 1.0
argument path: model[standard]/descriptor[se_atten_v2]/scaling_factor
(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Otherwise, the scaling of attention weights is set to temperature.
normalize:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten_v2]/normalize
(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
temperature:#
type: float, optional
argument path: model[standard]/descriptor[se_atten_v2]/temperature
(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
concat_output_tebd:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_atten_v2]/concat_output_tebd
(Supported Backend: PyTorch) Whether to concat type embedding at the output of the descriptor.
When type is set to dpa2:
(Supported Backend: PyTorch)
repinit:#
type: dict
argument path: model[standard]/descriptor[dpa2]/repinit
The arguments used to initialize the repinit block.
rcut:#
type: float
argument path: model[standard]/descriptor[dpa2]/repinit/rcut
The cut-off radius.
rcut_smth:#
type: float
argument path: model[standard]/descriptor[dpa2]/repinit/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
nsel:#
type: str | int
argument path: model[standard]/descriptor[dpa2]/repinit/nsel
Maximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines the number automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
neuron:#
type: list, optional, default: [25, 50, 100]
argument path: model[standard]/descriptor[dpa2]/repinit/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
axis_neuron:#
type: int, optional, default: 16
argument path: model[standard]/descriptor[dpa2]/repinit/axis_neuron
Size of the submatrix of G (embedding matrix).
tebd_dim:#
type: int, optional, default: 8
argument path: model[standard]/descriptor[dpa2]/repinit/tebd_dim
The dimension of atom type embedding.
tebd_input_mode:#
type: str, optional, default: concat
argument path: model[standard]/descriptor[dpa2]/repinit/tebd_input_mode
The input mode of the type embedding. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the union input for the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine the output with the radial embedding network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
set_davg_zero:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repinit/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[dpa2]/repinit/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”.
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repinit/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repinit/resnet_dt
Whether to use a “Timestep” in the skip connection.
use_three_body:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repinit/use_three_body
Whether to concatenate three-body representation in the output descriptor.
three_body_neuron:#
type: list, optional, default: [2, 4, 8]
argument path: model[standard]/descriptor[dpa2]/repinit/three_body_neuron
Number of neurons in each hidden layer of the three-body embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
three_body_rcut:#
type: float, optional, default: 4.0
argument path: model[standard]/descriptor[dpa2]/repinit/three_body_rcut
The cut-off radius in the three-body representation.
three_body_rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[dpa2]/repinit/three_body_rcut_smth
Where to start smoothing in the three-body representation. For example the 1/r term is smoothed from three_body_rcut to three_body_rcut_smth.
three_body_sel:#
type: str | int, optional, default: 40
argument path: model[standard]/descriptor[dpa2]/repinit/three_body_sel
Maximally possible number of selected neighbors in the three-body representation. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines the number automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
repformer:#
type: dict
argument path: model[standard]/descriptor[dpa2]/repformer
The arguments used to initialize the repformer block.
rcut:#
type: float
argument path: model[standard]/descriptor[dpa2]/repformer/rcut
The cut-off radius.
rcut_smth:#
type: float
argument path: model[standard]/descriptor[dpa2]/repformer/rcut_smth
Where to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
nsel:#
type: str | int
argument path: model[standard]/descriptor[dpa2]/repformer/nsel
Maximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option determines the number automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
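Since rcut, rcut_smth, and nsel carry no default in either the repinit or the repformer block, a dpa2 descriptor needs at least those keys in both. The sketch below builds such a fragment and checks for them; all numeric values are illustrative assumptions, not recommended settings.

```python
import json

# Illustrative values only; none of these numbers are recommendations.
descriptor = {
    "type": "dpa2",
    "repinit": {
        "rcut": 6.0, "rcut_smth": 0.5, "nsel": 120,
        "use_three_body": True, "three_body_rcut": 4.0,
        "three_body_rcut_smth": 0.5, "three_body_sel": 40,
    },
    "repformer": {"rcut": 4.0, "rcut_smth": 3.5, "nsel": 40},
}

# rcut, rcut_smth and nsel have no default in either block, so check them.
for block in ("repinit", "repformer"):
    missing = {"rcut", "rcut_smth", "nsel"} - descriptor[block].keys()
    assert not missing, f"{block} is missing {sorted(missing)}"

print(json.dumps(descriptor, indent=2))
```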
nlayers:#
type: int, optional, default: 3
argument path: model[standard]/descriptor[dpa2]/repformer/nlayers
The number of repformer layers.
g1_dim:#
type: int, optional, default: 128
argument path: model[standard]/descriptor[dpa2]/repformer/g1_dim
The dimension of invariant single-atom representation.
g2_dim:#
type: int, optional, default: 16
argument path: model[standard]/descriptor[dpa2]/repformer/g2_dim
The dimension of invariant pair-atom representation.
axis_neuron:#
type: int, optional, default: 4
argument path: model[standard]/descriptor[dpa2]/repformer/axis_neuron
The number of dimension of submatrix in the symmetrization ops.
direct_dist:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repformer/direct_dist
Whether to use the direct distance as input for the embedding net to get g2, instead of the smoothed 1/r.
update_g1_has_conv:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g1_has_conv
Update the g1 rep with convolution term.
update_g1_has_drrd:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g1_has_drrd
Update the g1 rep with the drrd term.
update_g1_has_grrg:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g1_has_grrg
Update the g1 rep with the grrg term.
update_g1_has_attn:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g1_has_attn
Update the g1 rep with the localized self-attention.
update_g2_has_g1g1:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g2_has_g1g1
Update the g2 rep with the g1xg1 term.
update_g2_has_attn:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/update_g2_has_attn
Update the g2 rep with the gated self-attention.
use_sqrt_nnei:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/use_sqrt_nnei
Whether to use the square root of the number of neighbors for symmetrization_op normalization instead of using the number of neighbors directly.
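The difference between the two normalization choices is easy to see numerically. In this sketch the function name is hypothetical, and dividing the symmetrization result by the returned factor is an assumption about how the value is used.

```python
import math


def normalization_factor(nnei, use_sqrt_nnei=True):
    """Return the factor that would normalize the symmetrization op."""
    return math.sqrt(nnei) if use_sqrt_nnei else float(nnei)


print(normalization_factor(16))         # 4.0
print(normalization_factor(16, False))  # 16.0
```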
g1_out_conv:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/g1_out_conv
Whether to put the convolutional update of g1 separately outside the concatenated MLP update.
g1_out_mlp:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/g1_out_mlp
Whether to put the self MLP update of g1 separately outside the concatenated MLP update.
update_h2:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repformer/update_h2
Update the h2 rep.
attn1_hidden:#
type: int, optional, default: 64
argument path: model[standard]/descriptor[dpa2]/repformer/attn1_hidden
The hidden dimension of localized self-attention to update the g1 rep.
attn1_nhead:#
type: int, optional, default: 4
argument path: model[standard]/descriptor[dpa2]/repformer/attn1_nhead
The number of heads in localized self-attention to update the g1 rep.
attn2_hidden:#
type: int, optional, default: 16
argument path: model[standard]/descriptor[dpa2]/repformer/attn2_hidden
The hidden dimension of gated self-attention to update the g2 rep.
attn2_nhead:#
type: int, optional, default: 4
argument path: model[standard]/descriptor[dpa2]/repformer/attn2_nhead
The number of heads in gated self-attention to update the g2 rep.
attn2_has_gate:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/repformer/attn2_has_gate
Whether to use gate in the gated self-attention to update the g2 rep.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[dpa2]/repformer/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”.
update_style:#
type: str, optional, default: res_avg
argument path: model[standard]/descriptor[dpa2]/repformer/update_style
Style to update a representation. Supported options are:
‘res_avg’: Updates a rep u with $u = \frac{1}{\sqrt{n+1}} (u + u_1 + u_2 + \dots + u_n)$.
‘res_incr’: Updates a rep u with $u = u + \frac{1}{\sqrt{n}} (u_1 + u_2 + \dots + u_n)$.
‘res_residual’: Updates a rep u with $u = u + (r_1 u_1 + r_2 u_2 + \dots + r_n u_n)$, where $r_1, r_2, \dots, r_n$ are residual weights defined by update_residual and update_residual_init.
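The three update styles can be compared with a small NumPy sketch. The function `update_rep` and its arguments are hypothetical names chosen for illustration; they are not part of the DeePMD-kit API, and the residual weights are passed in by hand rather than learned.

```python
import math

import numpy as np


def update_rep(u, updates, style="res_avg", residuals=None):
    """Combine a rep u with its n update terms u_1..u_n (illustrative only)."""
    n = len(updates)
    total = sum(updates)  # u_1 + u_2 + ... + u_n
    if style == "res_avg":
        return (u + total) / math.sqrt(n + 1)
    if style == "res_incr":
        return u + total / math.sqrt(n)
    if style == "res_residual":
        # residuals[i] stands in for the weight r_i applied to update u_i
        return u + sum(r * u_i for r, u_i in zip(residuals, updates))
    raise ValueError(f"unknown update_style: {style}")


u = np.ones(3)
updates = [np.ones(3), np.ones(3), np.ones(3)]
avg = update_rep(u, updates, "res_avg")
incr = update_rep(u, updates, "res_incr")
res = update_rep(u, updates, "res_residual", residuals=[0.1, 0.1, 0.1])
```

With small residual weights, ‘res_residual’ keeps the updated rep close to u, whereas ‘res_avg’ and ‘res_incr’ rescale by the number of update terms.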
update_residual:#
type: float, optional, default: 0.001
argument path: model[standard]/descriptor[dpa2]/repformer/update_residual
In residual update mode, the initial standard deviation of the residual vector weights.
update_residual_init:#
type: str, optional, default: norm
argument path: model[standard]/descriptor[dpa2]/repformer/update_residual_init
In residual update mode, the initialization mode of the residual vector weights. Supported modes are: [‘norm’, ‘const’].
set_davg_zero:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
trainable_ln:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/repformer/trainable_ln
Whether to use trainable shift and scale weights in layer normalization.
ln_eps:#
type: float | NoneType, optional, default: None
argument path: model[standard]/descriptor[dpa2]/repformer/ln_eps
The epsilon value for layer normalization. The default is 1e-3 in the TensorFlow backend, for consistency with Keras, and 1e-5 in the PyTorch and DP implementations.
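To show where ln_eps and the trainable shift and scale weights enter, here is a minimal NumPy sketch of layer normalization over the last axis. The function `layer_norm` and its `scale` and `shift` arguments are illustrative assumptions, not the backend implementation.

```python
import numpy as np


def layer_norm(x, eps=1e-5, scale=None, shift=None):
    """Normalize x over its last axis; eps guards against zero variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    y = (x - mean) / np.sqrt(var + eps)
    if scale is not None:  # trainable scale weight, if enabled
        y = y * scale
    if shift is not None:  # trainable shift weight, if enabled
        y = y + shift
    return y


# A constant row has zero variance; eps keeps the result finite.
flat = layer_norm(np.full((2, 4), 3.0))
normed = layer_norm(np.array([[1.0, 2.0, 3.0, 4.0]]))
```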
concat_output_tebd:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/concat_output_tebd
Whether to concat type embedding at the output of the descriptor.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[dpa2]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
smooth:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/smooth
Whether to use smoothness in processes such as attention weights calculation.
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[dpa2]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
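One way to picture exclude_types is as a boolean type-pair mask. The sketch below assumes the exclusion is symmetric (excluding [0, 1] also excludes [1, 0]); `pair_mask` is a hypothetical helper, not a DeePMD-kit function.

```python
import numpy as np


def pair_mask(ntypes, exclude_types):
    """Return a (ntypes, ntypes) mask; False marks non-interacting type pairs."""
    mask = np.ones((ntypes, ntypes), dtype=bool)
    for ti, tj in exclude_types:
        mask[ti, tj] = False
        mask[tj, ti] = False  # assumed symmetric
    return mask


mask = pair_mask(3, [[0, 1]])
```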
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[dpa2]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division-by-zero error in the environment matrix calculation without protection.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa2]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[dpa2]/seed
Random seed for parameter initialization.
add_tebd_to_repinit_out:#
type: bool, optional, default: False, alias: repformer_add_type_ebd_to_seq
argument path: model[standard]/descriptor[dpa2]/add_tebd_to_repinit_out
Add type embedding to the output representation from repinit before inputting it into repformer.
use_econf_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/use_econf_tebd
(Supported Backend: PyTorch) Whether to use electronic configuration type embedding.
use_tebd_bias:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa2]/use_tebd_bias
Whether to use bias in the type embedding layer.
When type is set to dpa3:
(Supported Backend: PyTorch)
repflow:#
type: dict
argument path: model[standard]/descriptor[dpa3]/repflow
The arguments used to initialize the repflow block.
n_dim:#
type: int, optional, default: 128
argument path: model[standard]/descriptor[dpa3]/repflow/n_dim
The dimension of node representation.
e_dim:#
type: int, optional, default: 64
argument path: model[standard]/descriptor[dpa3]/repflow/e_dim
The dimension of edge representation.
a_dim:#
type: int, optional, default: 64
argument path: model[standard]/descriptor[dpa3]/repflow/a_dim
The dimension of angle representation.
nlayers:#
type: int, optional, default: 6
argument path: model[standard]/descriptor[dpa3]/repflow/nlayers
The number of repflow layers.
e_rcut:#
type: float
argument path: model[standard]/descriptor[dpa3]/repflow/e_rcut
The edge cut-off radius.
e_rcut_smth:#
type: float
argument path: model[standard]/descriptor[dpa3]/repflow/e_rcut_smth
Where to start smoothing for edge. For example the 1/r term is smoothed from rcut to rcut_smth.
e_sel:#
type: str | int
argument path: model[standard]/descriptor[dpa3]/repflow/e_sel
Maximally possible number of selected edge neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines the sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
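The arithmetic behind the “auto” option can be sketched as follows. The helper `auto_sel` is hypothetical and takes the observed maximum neighbor count as a plain integer; how that count is gathered from the data is not shown here.

```python
import math


def auto_sel(max_neighbors, spec="auto"):
    """Scale the observed maximum by the factor, then round up to a multiple of 4."""
    # "auto" is shorthand for "auto:1.1"
    factor = 1.1 if spec == "auto" else float(spec.split(":", 1)[1])
    scaled = max_neighbors * factor
    return int(math.ceil(scaled / 4.0)) * 4


sel_default = auto_sel(46)             # 46 * 1.1 = 50.6, rounded up to 52
sel_custom = auto_sel(40, "auto:1.5")  # 40 * 1.5 = 60, already a multiple of 4
```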
a_rcut:#
type: float
argument path: model[standard]/descriptor[dpa3]/repflow/a_rcut
The angle cut-off radius.
a_rcut_smth:#
type: float
argument path: model[standard]/descriptor[dpa3]/repflow/a_rcut_smth
Where to start smoothing for angle. For example the 1/r term is smoothed from rcut to rcut_smth.
a_sel:#
type: str | int
argument path: model[standard]/descriptor[dpa3]/repflow/a_sel
Maximally possible number of selected angle neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines the sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
a_compress_rate:#
type: int, optional, default: 0
argument path: model[standard]/descriptor[dpa3]/repflow/a_compress_rate
The compression rate for angular messages. The default value is 0, indicating no compression. If a non-zero integer c is provided, the node and edge dimensions will be compressed to a_dim/c and a_dim/2c, respectively, within the angular message.
a_compress_e_rate:#
type: int, optional, default: 1
argument path: model[standard]/descriptor[dpa3]/repflow/a_compress_e_rate
The extra compression rate for edge in angular message compression. The default value is 1. When using angular message compression with a_compress_rate c and a_compress_e_rate c_e, the edge dimension will be compressed to (c_e * a_dim / 2c) within the angular message.
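The compressed dimensions described for a_compress_rate and a_compress_e_rate work out as in this sketch. The function name is hypothetical, and the use of integer division is an assumption made so that the results are whole dimensions.

```python
def angular_compressed_dims(a_dim, a_compress_rate, a_compress_e_rate=1):
    """Return (node_dim, edge_dim) inside the angular message, or None if uncompressed."""
    if a_compress_rate == 0:
        return None  # 0 means no compression
    c, c_e = a_compress_rate, a_compress_e_rate
    node_dim = a_dim // c                 # a_dim / c
    edge_dim = (c_e * a_dim) // (2 * c)   # c_e * a_dim / 2c
    return node_dim, edge_dim


dims_basic = angular_compressed_dims(64, 2)       # node 32, edge 16
dims_extra = angular_compressed_dims(64, 2, 2)    # node 32, edge 32
```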
a_compress_use_split:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/repflow/a_compress_use_split
Whether to split first sub-vectors instead of linear mapping during angular message compression. The default value is False.
n_multi_edge_message:#
type: int, optional, default: 1
argument path: model[standard]/descriptor[dpa3]/repflow/n_multi_edge_message
The head number of multiple edge messages to update node feature. Default is 1, indicating one head edge message.
axis_neuron:#
type: int, optional, default: 4
argument path: model[standard]/descriptor[dpa3]/repflow/axis_neuron
The dimension of the submatrix in the symmetrization ops.
fix_stat_std:#
type: float, optional, default: 0.3
argument path: model[standard]/descriptor[dpa3]/repflow/fix_stat_std
If non-zero (default is 0.3), use this constant as the normalization standard deviation instead of computing it from data statistics.
skip_stat:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/repflow/skip_stat
(Deprecated, kept only for compatibility.) This parameter is obsolete and will be removed. If set to True, it forces fix_stat_std=0.3 for backward compatibility. Transition to fix_stat_std parameter immediately.
update_angle:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa3]/repflow/update_angle
Whether to update the angle rep. If False, only the node and edge reps will be used.
update_style:#
type: str, optional, default: res_residual
argument path: model[standard]/descriptor[dpa3]/repflow/update_style
Style to update a representation. Supported options are:
‘res_avg’: Updates a rep u with $u = \frac{1}{\sqrt{n+1}} (u + u_1 + u_2 + \dots + u_n)$.
‘res_incr’: Updates a rep u with $u = u + \frac{1}{\sqrt{n}} (u_1 + u_2 + \dots + u_n)$.
‘res_residual’: Updates a rep u with $u = u + (r_1 u_1 + r_2 u_2 + \dots + r_n u_n)$, where $r_1, r_2, \dots, r_n$ are residual weights defined by update_residual and update_residual_init.
update_residual:#
type: float, optional, default: 0.1
argument path: model[standard]/descriptor[dpa3]/repflow/update_residual
In residual update mode, the initial standard deviation of the residual vector weights.
update_residual_init:#
type: str, optional, default: const
argument path: model[standard]/descriptor[dpa3]/repflow/update_residual_init
In residual update mode, the initialization mode of the residual vector weights. Supported modes are: [‘norm’, ‘const’].
optim_update:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa3]/repflow/optim_update
Whether to enable the optimized update method, which uses a more efficient process. Defaults to True.
smooth_edge_update:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/repflow/smooth_edge_update
Whether to make edge update smooth. If True, the edge update from angle message will not use self as padding.
edge_init_use_dist:#
type: bool, optional, default: False, alias: edge_use_dist
argument path: model[standard]/descriptor[dpa3]/repflow/edge_init_use_dist
Whether to use direct distance r to initialize the edge features instead of 1/r. Note that when using this option, the activation function will not be used when initializing edge features.
use_exp_switch:#
type: bool, optional, default: False, alias: use_env_envelope
argument path: model[standard]/descriptor[dpa3]/repflow/use_exp_switch
Whether to use an exponential switch function instead of a polynomial one in the neighbor update. The exponential switch function ensures neighbor contributions smoothly diminish as the interatomic distance r approaches the cutoff radius rcut. Specifically, the function is defined as: s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r ≤ rcut, and s(r) = 0 for r > rcut. Here, rcut_smth is an adjustable smoothing factor and should be chosen carefully according to rcut, ensuring s(r) approaches zero smoothly at the cutoff. Typical recommended values are rcut_smth = 5.3 for rcut = 6.0, and 3.5 for rcut = 4.0.
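The switch function above can be evaluated directly to check a choice of rcut_smth. This is a plain transcription of the formula; the function name `exp_switch` is a hypothetical one used only for this sketch.

```python
import math


def exp_switch(r, rcut, rcut_smth):
    """s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r <= rcut, else 0."""
    if r > rcut:
        return 0.0
    return math.exp(-math.exp(20.0 * (r - rcut_smth) / rcut_smth))


near = exp_switch(0.1, rcut=6.0, rcut_smth=5.3)       # close to 1 at short range
at_smth = exp_switch(5.3, rcut=6.0, rcut_smth=5.3)    # exp(-1) at r = rcut_smth
at_cut = exp_switch(6.0, rcut=6.0, rcut_smth=5.3)     # small but non-zero at the cutoff
beyond = exp_switch(6.1, rcut=6.0, rcut_smth=5.3)     # exactly 0 beyond rcut
```

Evaluating s(rcut) for a candidate rcut_smth is a quick way to see how large the residual jump at the cutoff is.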
use_dynamic_sel:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/repflow/use_dynamic_sel
Whether to dynamically select neighbors within the cutoff radius. If True, the exact number of neighbors within the cutoff radius is used without padding to a fixed selection number. When enabled, users can safely set larger values for e_sel or a_sel (e.g., 1200 or 300, respectively) to guarantee capturing all neighbors within the cutoff radius. Note that when using dynamic selection, smooth_edge_update must be True.
sel_reduce_factor:#
type: float, optional, default: 10.0
argument path: model[standard]/descriptor[dpa3]/repflow/sel_reduce_factor
Reduction factor applied to neighbor-scale normalization when use_dynamic_sel is True. In the dynamic selection case, neighbor-scale normalization will use e_sel / sel_reduce_factor or a_sel / sel_reduce_factor instead of the raw e_sel or a_sel values, accommodating larger selection numbers.
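The interaction between use_dynamic_sel, smooth_edge_update and sel_reduce_factor can be summarized in a short sketch. The helper `neighbor_norm_scales` is hypothetical; it only reproduces the arithmetic and the constraint stated above, not the actual normalization code.

```python
def neighbor_norm_scales(repflow):
    """Return the (edge, angle) neighbor counts used for normalization."""
    dynamic = repflow.get("use_dynamic_sel", False)
    if dynamic and not repflow.get("smooth_edge_update", False):
        raise ValueError("use_dynamic_sel requires smooth_edge_update to be True")
    # The reduction factor only applies in the dynamic selection case.
    factor = repflow.get("sel_reduce_factor", 10.0) if dynamic else 1.0
    return repflow["e_sel"] / factor, repflow["a_sel"] / factor


scales = neighbor_norm_scales(
    {"e_sel": 1200, "a_sel": 300, "use_dynamic_sel": True, "smooth_edge_update": True}
)
```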
concat_output_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/concat_output_tebd
Whether to concat type embedding at the output of the descriptor.
add_chg_spin_ebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/add_chg_spin_ebd
Whether to add charge and spin embedding to the descriptor. When enabled, fparam is expected to have 2 values (charge, spin) which are embedded and added to the type embedding.
activation_function:#
type: str, optional, default: silu
argument path: model[standard]/descriptor[dpa3]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[dpa3]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[dpa3]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[dpa3]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division-by-zero error in the environment matrix calculation without protection.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa3]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[dpa3]/seed
Random seed for parameter initialization.
use_econf_tebd:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/use_econf_tebd
(Supported Backend: PyTorch) Whether to use electronic configuration type embedding.
use_tebd_bias:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[dpa3]/use_tebd_bias
Whether to use bias in the type embedding layer.
use_loc_mapping:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[dpa3]/use_loc_mapping
Whether to use local atom index mapping in training or non-parallel inference. When True, local indexing and mapping are applied to neighbor lists and embeddings during descriptor computation.
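Putting the dpa3 options together, a descriptor section might be assembled and exported as below. The numeric values are illustrative only: the cutoffs and selection numbers are taken from the examples quoted in the parameter descriptions above, and the remaining entries restate the documented defaults. The names `descriptor` and `REQUIRED_REPFLOW_KEYS` are assumptions for this sketch.

```python
import json

# Keys under repflow that are documented without a default value.
REQUIRED_REPFLOW_KEYS = [
    "e_rcut", "e_rcut_smth", "e_sel",
    "a_rcut", "a_rcut_smth", "a_sel",
]

descriptor = {
    "type": "dpa3",
    "repflow": {
        "n_dim": 128,
        "e_dim": 64,
        "a_dim": 64,
        "nlayers": 6,
        "e_rcut": 6.0,
        "e_rcut_smth": 5.3,
        "e_sel": 1200,
        "a_rcut": 4.0,
        "a_rcut_smth": 3.5,
        "a_sel": 300,
        "use_exp_switch": True,
        "use_dynamic_sel": True,
        "smooth_edge_update": True,  # must be True with use_dynamic_sel
        "sel_reduce_factor": 10.0,
    },
    "activation_function": "silu",
    "precision": "default",
}

missing = [k for k in REQUIRED_REPFLOW_KEYS if k not in descriptor["repflow"]]
text = json.dumps({"model": {"descriptor": descriptor}}, indent=2)
```

The resulting `text` is a JSON fragment; a full input file also needs the other top-level sections documented on this page.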
When type is set to se_a_ebd_v2 (or its alias se_a_tpe_v2):
(Supported Backend: TensorFlow)
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_a_ebd_v2]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines the sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
rcut:#
type: float, optional, default: 6.0
argument path: model[standard]/descriptor[se_a_ebd_v2]/rcut
The cut-off radius.
rcut_smth:#
type: float, optional, default: 0.5
argument path: model[standard]/descriptor[se_a_ebd_v2]/rcut_smth
Where to start smoothing. For example, the 1/r term is smoothed from rcut to rcut_smth.
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_a_ebd_v2]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
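The skip-connection rule can be checked for a given neuron list with a few lines of Python. The helper below is hypothetical, and the input width of 1 for the first layer is an assumption made for illustration.

```python
def skip_connection_flags(neuron, input_dim=1):
    """For each hidden layer, report whether the rule builds a skip connection."""
    flags = []
    prev = input_dim
    for width in neuron:
        # Same size as the previous layer, or exactly twice as large.
        flags.append(width == prev or width == 2 * prev)
        prev = width
    return flags


default_flags = skip_connection_flags([10, 20, 40])
mixed_flags = skip_connection_flags([32, 32, 48])
```

With the default [10, 20, 40], the second and third layers double the previous width and therefore get a skip connection under this rule; a list such as [32, 32, 48] loses it on the last layer.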
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_a_ebd_v2]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_a_ebd_v2]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_ebd_v2]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_ebd_v2]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
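As a quick illustration of the parameter-set count, the following sketch uses a hypothetical helper `n_embedding_nets`; it only restates the counting rule above.

```python
def n_embedding_nets(n_types, type_one_side=False):
    """Number of embedding network parameter sets for n_types atom types."""
    return n_types if type_one_side else n_types ** 2


one_side = n_embedding_nets(4, type_one_side=True)
two_side = n_embedding_nets(4, type_one_side=False)
```

For a four-type system this is 4 sets versus 16, which is why type_one_side can noticeably reduce the model size when there are many atom types.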
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_a_ebd_v2]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_a_ebd_v2]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_a_ebd_v2]/seed
Random seed for parameter initialization
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_a_ebd_v2]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
env_protection:#
type: float, optional, default: 0.0
argument path: model[standard]/descriptor[se_a_ebd_v2]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division-by-zero error in the environment matrix calculation without protection.
set_davg_zero:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_ebd_v2]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
When type is set to se_a_mask:
(Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (Non-PBC system). aparam are required as an indicator matrix for the real/virtual sign of input atoms.
sel:#
type: list[int] | str, optional, default: auto
argument path: model[standard]/descriptor[se_a_mask]/sel
This parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines the sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
neuron:#
type: list[int], optional, default: [10, 20, 40]
argument path: model[standard]/descriptor[se_a_mask]/neuron
Number of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
axis_neuron:#
type: int, optional, default: 4, alias: n_axis_neuron
argument path: model[standard]/descriptor[se_a_mask]/axis_neuron
Size of the submatrix of G (embedding matrix).
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/descriptor[se_a_mask]/activation_function
The activation function in the embedding net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_mask]/resnet_dt
Whether to use a “Timestep” in the skip connection
type_one_side:#
type: bool, optional, default: False
argument path: model[standard]/descriptor[se_a_mask]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
exclude_types:#
type: list[list[int]], optional, default: []
argument path: model[standard]/descriptor[se_a_mask]/exclude_types
The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
precision:#
type: str, optional, default: default
argument path: model[standard]/descriptor[se_a_mask]/precision
The precision of the embedding net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
trainable:#
type: bool, optional, default: True
argument path: model[standard]/descriptor[se_a_mask]/trainable
Whether the parameters in the embedding net are trainable.
seed:#
type: NoneType | int, optional
argument path: model[standard]/descriptor[se_a_mask]/seed
Random seed for parameter initialization
fitting_net:#
type: dict
argument path: model[standard]/fitting_net
The fitting of physical properties.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key), default: ener
argument path: model[standard]/fitting_net/type
possible choices: ener, dos, property, polar, dipole
The type of the fitting.
ener: Fit an energy model (potential energy surface).
dos: Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has one line per frame and one column per energy grid point (times the number of atoms in atom_dos.npy). See loss parameter.
property: (Supported Backend: PyTorch)
polar: Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all the selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file has one line per frame and either 9 times the number of selected atoms columns or 9 columns. See loss parameter.
dipole: Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all the selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file has one line per frame and either 3 times the number of selected atoms columns or 3 columns. See loss parameter.
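The column counts stated for the dos, polar and dipole label files can be tabulated with a small helper. `expected_label_columns` is a hypothetical function for sanity-checking array shapes before training; for polar and dipole, `n_atoms` stands for the number of selected atoms.

```python
def expected_label_columns(fit_type, atomic=False, n_atoms=1, numb_dos=300):
    """Number of columns per frame in the label file for a fitting type."""
    per_atom = {"dipole": 3, "polar": 9, "dos": numb_dos}
    if fit_type not in per_atom:
        raise ValueError(f"no tabulated label shape for fitting type: {fit_type}")
    # Global labels have one block per frame; atomic labels have one per atom.
    return per_atom[fit_type] * (n_atoms if atomic else 1)


global_polar = expected_label_columns("polar")
atomic_dipole = expected_label_columns("dipole", atomic=True, n_atoms=64)
atomic_dos = expected_label_columns("dos", atomic=True, n_atoms=2, numb_dos=300)
```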
When type is set to ener:
Fit an energy model (potential energy surface).
numb_fparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[ener]/numb_fparam
The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[ener]/numb_aparam
The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
default_fparam:#
type: NoneType | list[float], optional, default: None
argument path: model[standard]/fitting_net[ener]/default_fparam
(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
dim_case_embd:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[ener]/dim_case_embd
(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
neuron:#
type: list[int], optional, default: [120, 120, 120], alias: n_neuron
argument path: model[standard]/fitting_net[ener]/neuron
The number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/fitting_net[ener]/activation_function
The activation function in the fitting net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
precision:#
type: str, optional, default: default
argument path: model[standard]/fitting_net[ener]/precision
The precision of the fitting net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
resnet_dt:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[ener]/resnet_dt
Whether to use a “Timestep” in the skip connection
trainable:#
type: bool | list[bool], optional, default: True
argument path: model[standard]/fitting_net[ener]/trainable
Whether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool (Supported Backend: TensorFlow): Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
rcond:#
type: float | NoneType, optional, default: None
argument path: model[standard]/fitting_net[ener]/rcond
The condition number used to determine the initial energy shift for each type of atoms. See rcond in numpy.linalg.lstsq() for more details.
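The role of rcond can be illustrated with a least-squares fit of per-type energy shifts. This is a sketch under the assumption that the shift is obtained by regressing frame energies on per-type atom counts; the function `initial_energy_shift` and the toy data are hypothetical, while `numpy.linalg.lstsq` and its `rcond` argument are the real NumPy API.

```python
import numpy as np


def initial_energy_shift(type_counts, energies, rcond=None):
    """Least-squares per-type energy shift from atom counts and frame energies."""
    shift, _, _, _ = np.linalg.lstsq(type_counts, energies, rcond=rcond)
    return shift


# Three frames, two atom types: rows are the per-type atom counts of each frame.
counts = np.array([[2.0, 1.0], [1.0, 2.0], [3.0, 3.0]])
true_shift = np.array([-1.0, -2.0])
energies = counts @ true_shift
shift = initial_energy_shift(counts, energies, rcond=None)
```

Singular values smaller than rcond times the largest singular value are treated as zero by `numpy.linalg.lstsq`, which matters when some atom types always appear in fixed ratios.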
seed:#
type: NoneType | int, optional
argument path: model[standard]/fitting_net[ener]/seed
Random seed for parameter initialization of the fitting net
atom_ener:#
type: list[float | None], optional, default: []
argument path: model[standard]/fitting_net[ener]/atom_ener
Specify the atomic energy in vacuum for each type
layer_name:#
type: list[str], optional
argument path: model[standard]/fitting_net[ener]/layer_name
The name of each layer. The length of this list should be equal to n_neuron + 1. If two layers, either in the same fitting or different fittings, have the same name, they will share the same neural network parameters. The shape of these layers should be the same. If null is given for a layer, parameters will not be shared.
use_aparam_as_mask:#
type: bool, optional, default: False
argument path: model[standard]/fitting_net[ener]/use_aparam_as_mask
Whether to use the aparam as a mask in input. If True, the aparam will not be used in the fitting net for embedding. When descrpt is se_a_mask, the aparam will be used as a mask to indicate whether the input atom is real or virtual, and use_aparam_as_mask should be set to True.
When type is set to dos:
Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has one line per frame and one column per energy grid point (times the number of atoms in atom_dos.npy). See loss parameter.
numb_fparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dos]/numb_fparam
The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dos]/numb_aparam
The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
default_fparam:#
type: NoneType | list[float], optional, default: None
argument path: model[standard]/fitting_net[dos]/default_fparam
(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
dim_case_embd:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dos]/dim_case_embd
(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
neuron:#
type: list[int], optional, default: [120, 120, 120]
argument path: model[standard]/fitting_net[dos]/neuron
The number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/fitting_net[dos]/activation_function
The activation function in the fitting net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
precision:#
type: str, optional, default: float64
argument path: model[standard]/fitting_net[dos]/precision
The precision of the fitting net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
resnet_dt:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[dos]/resnet_dt
Whether to use a “Timestep” in the skip connection
trainable:#
type: bool | list[bool], optional, default: True
argument path: model[standard]/fitting_net[dos]/trainable
Whether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
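The two accepted forms of trainable can be normalized to one flag per layer. The following is a minimal sketch of that rule only; the helper name expand_trainable is hypothetical and is not part of DeePMD-kit.

```python
def expand_trainable(trainable, neuron):
    """Return one trainable flag per layer: the hidden layers plus the output layer."""
    n_layers = len(neuron) + 1  # hidden layers followed by one output layer
    if isinstance(trainable, bool):
        # A single bool applies to every layer of the fitting net.
        return [trainable] * n_layers
    flags = list(trainable)
    if len(flags) != n_layers:
        raise ValueError(f"expected {n_layers} flags, got {len(flags)}")
    return flags


# Freeze the first two hidden layers and train the rest.
print(expand_trainable([False, False, True, True], [120, 120, 120]))
```

With the default neuron of [120, 120, 120], a list-valued trainable therefore needs four entries.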
rcond:#
type: float | NoneType, optional, default: None
argument path: model[standard]/fitting_net[dos]/rcond
The condition number used to determine the initial energy shift for each type of atoms. See rcond in numpy.linalg.lstsq() for more details.
seed:#
type: NoneType | int, optional
argument path: model[standard]/fitting_net[dos]/seed
Random seed for parameter initialization of the fitting net
numb_dos:#
type: int, optional, default: 300
argument path: model[standard]/fitting_net[dos]/numb_dos
The number of gridpoints on which the DOS is evaluated (NEDOS in VASP)
When type is set to property:
(Supported Backend: PyTorch)
numb_fparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[property]/numb_fparam
The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[property]/numb_aparam
The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
default_fparam:#
type: NoneType | list[float], optional, default: None
argument path: model[standard]/fitting_net[property]/default_fparam
(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
dim_case_embd:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[property]/dim_case_embd
(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
neuron:#
type: list[int], optional, default: [120, 120, 120], alias: n_neuron
argument path: model[standard]/fitting_net[property]/neuron
The number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/fitting_net[property]/activation_function
The activation function in the fitting net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[property]/resnet_dt
Whether to use a “Timestep” in the skip connection
precision:#
type: str, optional, default: default
argument path: model[standard]/fitting_net[property]/precision
The precision of the fitting net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
seed:#
type: NoneType | int, optional
argument path: model[standard]/fitting_net[property]/seed
Random seed for parameter initialization of the fitting net
task_dim:#
type: int, optional, default: 1
argument path: model[standard]/fitting_net[property]/task_dim
The dimension of outputs of fitting net
intensive:#
type: bool, optional, default: False
argument path: model[standard]/fitting_net[property]/intensive
Whether the fitting property is intensive
property_name:#
type: str
argument path: model[standard]/fitting_net[property]/property_name
The names of fitting property, which should be consistent with the property name in the dataset.
trainable:#
type: bool | list[bool], optional, default: True
argument path: model[standard]/fitting_net[property]/trainable
Whether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
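As an illustration of how the property arguments above fit together, the sketch below builds a fitting_net fragment as a Python dict and prints it as JSON. The property name band_gap and the seed value are hypothetical placeholders; every other value is either a documented default or an arbitrary choice made for the example.

```python
import json

# Fragment for model/fitting_net with type "property" (PyTorch backend).
fitting_net = {
    "type": "property",
    "property_name": "band_gap",  # hypothetical; must match the property name in the dataset
    "task_dim": 1,                # one output value per frame
    "intensive": True,            # arbitrary choice for this example
    "neuron": [120, 120, 120],
    "activation_function": "tanh",
    "resnet_dt": True,
    "precision": "default",
    "trainable": [True, True, True, True],  # len(neuron) + 1 entries
    "seed": 1,                    # hypothetical seed
}

# The list form of trainable needs one flag per hidden layer plus the output layer.
assert len(fitting_net["trainable"]) == len(fitting_net["neuron"]) + 1

print(json.dumps({"model": {"fitting_net": fitting_net}}, indent=4))
```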
When type is set to polar:
Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all the selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file either has number of frames lines and 9 times the number of selected atoms columns, or has number of frames lines and 9 columns. See loss parameter.
numb_fparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[polar]/numb_fparam
(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[polar]/numb_aparam
(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
default_fparam:#
type: NoneType | list[float], optional, default: None
argument path: model[standard]/fitting_net[polar]/default_fparam
(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
dim_case_embd:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[polar]/dim_case_embd
(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
neuron:#
type: list[int], optional, default: [120, 120, 120], alias: n_neuron
argument path: model[standard]/fitting_net[polar]/neuron
The number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/fitting_net[polar]/activation_function
The activation function in the fitting net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[polar]/resnet_dt
Whether to use a “Timestep” in the skip connection
precision:#
type: str, optional, default: default
argument path: model[standard]/fitting_net[polar]/precision
The precision of the fitting net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
fit_diag:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[polar]/fit_diag
Fit the diagonal part of the rotationally invariant polarizability matrix, which will be converted to the normal polarizability matrix by contracting with the rotation matrix.
scale:#
type: float | list[float], optional, default: 1.0
argument path: model[standard]/fitting_net[polar]/scale
The output of the fitting net (polarizability matrix) will be scaled by scale
shift_diag:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[polar]/shift_diag
Whether to shift the diagonal of polar, which is beneficial to training. Default is true.
sel_type:#
type: NoneType | list[int] | int, optional, alias: pol_type
argument path: model[standard]/fitting_net[polar]/sel_type
The atom types for which the atomic polarizability will be provided. If not set, all types will be selected. (Supported Backend: TensorFlow)
seed:#
type: NoneType | int, optional
argument path: model[standard]/fitting_net[polar]/seed
Random seed for parameter initialization of the fitting net
When type is set to dipole:
Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all the selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file either has number of frames lines and 3 times the number of selected atoms columns, or has number of frames lines and 3 columns. See loss parameter.
numb_fparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dipole]/numb_fparam
(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
numb_aparam:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dipole]/numb_aparam
(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
default_fparam:#
type: NoneType | list[float], optional, default: None
argument path: model[standard]/fitting_net[dipole]/default_fparam
(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
dim_case_embd:#
type: int, optional, default: 0
argument path: model[standard]/fitting_net[dipole]/dim_case_embd
(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
neuron:#
type: list[int], optional, default: [120, 120, 120], alias: n_neuron
argument path: model[standard]/fitting_net[dipole]/neuron
The number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
activation_function:#
type: str, optional, default: tanh
argument path: model[standard]/fitting_net[dipole]/activation_function
The activation function in the fitting net. Supported activation functions are “relu6”, “sigmoid”, “none”, “tanh”, “silut”, “gelu”, “linear”, “relu”, “softplus”, “silu”, “gelu_tf”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
resnet_dt:#
type: bool, optional, default: True
argument path: model[standard]/fitting_net[dipole]/resnet_dt
Whether to use a “Timestep” in the skip connection
precision:#
type: str, optional, default: default
argument path: model[standard]/fitting_net[dipole]/precision
The precision of the fitting net parameters, supported options are “float16”, “default”, “bfloat16”, “float64”, “float32”. Default follows the interface precision.
sel_type:#
type: NoneType | list[int] | int, optional, alias: dipole_type
argument path: model[standard]/fitting_net[dipole]/sel_type
The atom types for which the atomic dipole will be provided. If not set, all types will be selected. (Supported Backend: TensorFlow)
seed:#
type: NoneType | int, optional
argument path: model[standard]/fitting_net[dipole]/seed
Random seed for parameter initialization of the fitting net
model_branch_alias:#
type: list[str], optional, default: []
argument path: model[standard]/model_branch_alias
(Supported Backend: PyTorch) List of aliases for this model branch. Multiple aliases can be defined, and any alias can reference this branch throughout the model usage. Used only in multi-task models.
info:#
type: dict, optional, default: {}
argument path: model[standard]/info
(Supported Backend: PyTorch) Dictionary of metadata for this model or model branch. Store arbitrary key-value pairs with model- or branch-specific information. Used in both single- and multi-task models.
When type is set to frozen:
model_file:#
type: str
argument path: model[frozen]/model_file
Path to the frozen model file.
When type is set to pairtab:
(Supported Backend: TensorFlow) Pairwise tabulation energy model.
tab_file:#
type: str
argument path: model[pairtab]/tab_file
Path to the tabulation file.
rcut:#
type: float
argument path: model[pairtab]/rcut
The cut-off radius.
sel:#
type: list[int] | str | int
argument path: model[pairtab]/sel
This parameter sets the number of selected neighbors. Note that this parameter is a little different from that in other descriptors. Instead of separating each type of atoms, only the summation matters. This number strongly affects efficiency, so it should not be made too large. Usually 200 or less is enough, far below the GPU limitation of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the summation of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail, it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiplies the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
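The arithmetic behind the “auto” option can be sketched as follows. This is an illustration of the stated rule only (multiply the per-type maximum by the factor, then round up to a multiple of 4); the function name auto_sel is hypothetical, and the neighbor counting that DeePMD-kit performs on the training data is replaced here by a list supplied by the caller.

```python
import math


def auto_sel(max_neighbors_per_type, spec="auto"):
    """Turn per-type maximal neighbor counts into sel values for an 'auto' spec."""
    if spec == "auto":
        factor = 1.1  # "auto" is equivalent to "auto:1.1"
    elif spec.startswith("auto:"):
        factor = float(spec.split(":", 1)[1])
    else:
        raise ValueError(f"unsupported sel spec: {spec!r}")
    # Scale each maximum by the factor, then round up to a multiple of 4.
    return [4 * math.ceil(n * factor / 4) for n in max_neighbors_per_type]


sel = auto_sel([46, 92], "auto:1.5")
print(sel, sum(sel))  # for pairtab only the sum of sel matters
```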
When type is set to pairwise_dprc:
(Supported Backend: TensorFlow)
qm_model:#
type: dict
argument path: model[pairwise_dprc]/qm_model
qmmm_model:#
type: dict
argument path: model[pairwise_dprc]/qmmm_model
When type is set to linear_ener:
(Supported Backend: TensorFlow)
models:#
type: dict | list
argument path: model[linear_ener]/models
The sub-models.
weights:#
type: list | str
argument path: model[linear_ener]/weights
If the type is list of float, a list of weights for each model. If “mean”, the weights are set to be 1 / len(models). If “sum”, the weights are set to be 1.
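The three accepted forms of weights can be illustrated with a short sketch. The helper names resolve_weights and combine_energies are hypothetical; the sketch only mirrors the rules stated above and does not reproduce how DeePMD-kit evaluates the sub-models.

```python
def resolve_weights(weights, n_models):
    """Expand "mean", "sum" or an explicit list into one weight per sub-model."""
    if isinstance(weights, str):
        if weights == "mean":
            return [1.0 / n_models] * n_models
        if weights == "sum":
            return [1.0] * n_models
        raise ValueError(f"unknown weights keyword: {weights!r}")
    weights = [float(w) for w in weights]
    if len(weights) != n_models:
        raise ValueError("need exactly one weight per sub-model")
    return weights


def combine_energies(energies, weights):
    """Linear combination of the sub-model energies."""
    return sum(w * e for w, e in zip(weights, energies))


energies = [-10.0, -12.0]  # made-up sub-model outputs
print(combine_energies(energies, resolve_weights("mean", len(energies))))
```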

learning_rate:#

type: dict, optional
argument path: learning_rate
The definition of learning rate
start_lr:#
type: float
argument path: learning_rate/start_lr
The learning rate at the start of the training (after warmup).
stop_lr:#
type: float | NoneType, optional, default: None
argument path: learning_rate/stop_lr
The desired learning rate at the end of training. Mutually exclusive with stop_lr_ratio.
stop_lr_ratio:#
type: float | NoneType, optional, default: None
argument path: learning_rate/stop_lr_ratio
The ratio of stop_lr to start_lr. stop_lr = start_lr * stop_lr_ratio. Mutually exclusive with stop_lr.
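Because stop_lr and stop_lr_ratio are mutually exclusive, a configuration loader has to pick whichever one is present. A minimal sketch, assuming a hypothetical helper resolve_stop_lr (what happens when neither key is given is not covered here, so the sketch simply returns None):

```python
def resolve_stop_lr(start_lr, stop_lr=None, stop_lr_ratio=None):
    """Return the final learning rate implied by the two mutually exclusive keys."""
    if stop_lr is not None and stop_lr_ratio is not None:
        raise ValueError("stop_lr and stop_lr_ratio are mutually exclusive")
    if stop_lr_ratio is not None:
        # stop_lr = start_lr * stop_lr_ratio
        return start_lr * stop_lr_ratio
    return stop_lr  # may be None if neither key was given


print(resolve_stop_lr(1e-3, stop_lr_ratio=0.5))
```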
warmup_steps:#
type: int, optional, default: 0
argument path: learning_rate/warmup_steps
The number of steps for learning rate warmup. During warmup, the learning rate increases linearly from warmup_start_factor * start_lr to start_lr. Mutually exclusive with warmup_ratio. Default is 0 (no warmup).
warmup_ratio:#
type: float | NoneType, optional, default: None
argument path: learning_rate/warmup_ratio
The ratio of warmup steps to total training steps. The actual number of warmup steps is int(warmup_ratio * num_steps). Mutually exclusive with warmup_steps.
warmup_start_factor:#
type: float, optional, default: 0.0
argument path: learning_rate/warmup_start_factor
The factor of start_lr for the initial warmup learning rate. The warmup learning rate starts from warmup_start_factor * start_lr. Default is 0.0, meaning the learning rate starts from zero.
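The warmup arguments can be read together as a linear ramp. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the exact step indexing used by DeePMD-kit is not specified here, and the decay that follows the warmup phase is left out (the function just returns start_lr once warmup is over).

```python
def resolve_warmup_steps(num_steps, warmup_steps=0, warmup_ratio=None):
    """Pick the warmup length from the two mutually exclusive keys."""
    if warmup_ratio is not None:
        if warmup_steps:
            raise ValueError("warmup_steps and warmup_ratio are mutually exclusive")
        return int(warmup_ratio * num_steps)
    return warmup_steps


def warmup_lr(step, start_lr, warmup_steps, warmup_start_factor=0.0):
    """Linear ramp from warmup_start_factor * start_lr up to start_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return start_lr  # warmup finished (decay is not modeled in this sketch)
    progress = step / warmup_steps
    factor = warmup_start_factor + (1.0 - warmup_start_factor) * progress
    return start_lr * factor


n_warm = resolve_warmup_steps(num_steps=1000, warmup_ratio=0.25)
for step in (0, n_warm // 2, n_warm):
    print(step, warmup_lr(step, start_lr=1e-3, warmup_steps=n_warm))
```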
scale_by_worker:#
type: str, optional, default: linear
argument path: learning_rate/scale_by_worker
How to alter the learning rate when training in parallel or when the batch size is scaled. Valid values are linear (default), sqrt or none.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key), default: exp
argument path: learning_rate/type
possible choices: exp, cosine
The type of the learning rate.
When type is set to exp:
decay_steps:#
type: int, optional, default: 5000
argument path: learning_rate[exp]/decay_steps
The learning rate decays every decay_steps training steps. If decay_steps exceeds the number of steps in the decay phase (num_steps - warmup_steps) and decay_rate is not provided, it will be automatically adjusted to a sensible default value.
decay_rate:#
type: float | NoneType, optional, default: None
argument path: learning_rate[exp]/decay_rate
The decay rate for the learning rate. If this is provided, it will be used directly as the decay rate for learning rate instead of calculating it through interpolation between start_lr and stop_lr.
smooth:#
type: bool, optional, default: False
argument path: learning_rate[exp]/smooth
If True, use smooth exponential decay (lr decays continuously). If False (default), use stepped decay (lr decays every decay_steps).
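The difference between stepped and smooth decay can be sketched as below. This is not DeePMD-kit's implementation: the function name is hypothetical, warmup is ignored, and the way decay_rate is derived from start_lr and stop_lr when it is not given (chosen so that the learning rate reaches stop_lr after num_steps) is an assumption about what "interpolation" means here.

```python
import math


def exp_decay_lr(step, start_lr, stop_lr, num_steps,
                 decay_steps=5000, decay_rate=None, smooth=False):
    """Exponentially decayed learning rate at a given training step."""
    if decay_rate is None:
        # Assumed interpolation: start_lr * decay_rate ** (num_steps / decay_steps) == stop_lr
        decay_rate = math.exp(math.log(stop_lr / start_lr) / (num_steps / decay_steps))
    # Stepped decay changes the rate only every decay_steps; smooth decay changes it every step.
    exponent = step / decay_steps if smooth else step // decay_steps
    return start_lr * decay_rate ** exponent


for smooth in (False, True):
    print(smooth, exp_decay_lr(250, 1.0, 1e-3, 1000, decay_steps=100,
                               decay_rate=0.5, smooth=smooth))
```

With smooth set to False the rate stays constant within each block of decay_steps steps; with smooth set to True the same decay_rate is applied fractionally at every step.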
When type is set to cosine:

optimizer:#

type: dict, optional
argument path: optimizer
The definition of optimizer. Supported optimizer types depend on backend: TensorFlow/Paddle: Adam; PyTorch: Adam, AdamW, LKF, AdaMuon, HybridMuon.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key), default: Adam
argument path: optimizer/type
possible choices: Adam, AdamW, LKF, AdaMuon, HybridMuon
The type of optimizer to use.
AdamW: (Supported Backend: PyTorch)
LKF: (Supported Backend: PyTorch)
AdaMuon: (Supported Backend: PyTorch)
HybridMuon: (Supported Backend: PyTorch) HybridMuon optimizer (DeePMD-kit custom implementation). This is a Hybrid optimizer that automatically combines Muon and Adam. For matrix params: Muon update with Newton-Schulz based on selected muon_mode. For 1D params: Standard Adam. Name-based Adam routing is enabled: final effective parameter name segment containing ‘bias’ or starting with ‘adam_’ (case-insensitive) always uses Adam (no weight decay); segment starting with ‘adamw_’ (case-insensitive) uses AdamW-style decoupled decay. Trailing numeric ParameterList indices are ignored when deriving the effective segment. This is DIFFERENT from PyTorch’s torch.optim.Muon which ONLY supports 2D parameters.
When type is set to Adam:
adam_beta1:#
type: float, optional, default: 0.9
argument path: optimizer[Adam]/adam_beta1
Adam beta1 coefficient for first moment decay.
adam_beta2:#
type: float, optional, default: 0.999
argument path: optimizer[Adam]/adam_beta2
Adam beta2 coefficient for second moment decay.
weight_decay:#
type: float, optional, default: 0.0
argument path: optimizer[Adam]/weight_decay
Weight decay coefficient for Adam. In PyTorch and Paddle, this is an L2 penalty applied to gradients. TensorFlow does not support weight_decay and requires this value to be 0.
When type is set to AdamW:
(Supported Backend: PyTorch)
adam_beta1:#
type: float, optional, default: 0.9
argument path: optimizer[AdamW]/adam_beta1
(Supported Backend: PyTorch) AdamW beta1 coefficient for first moment decay.
adam_beta2:#
type: float, optional, default: 0.999
argument path: optimizer[AdamW]/adam_beta2
(Supported Backend: PyTorch) AdamW beta2 coefficient for second moment decay.
weight_decay:#
type: float, optional, default: 0.001
argument path: optimizer[AdamW]/weight_decay
(Supported Backend: PyTorch) Decoupled weight decay coefficient for AdamW optimizer (PyTorch only).
When type is set to LKF:
(Supported Backend: PyTorch)
kf_blocksize:#
type: int, optional, default: 5120
argument path: optimizer[LKF]/kf_blocksize
(Supported Backend: PyTorch) The blocksize for the Kalman filter.
kf_start_pref_e:#
type: float, optional, default: 1.0
argument path: optimizer[LKF]/kf_start_pref_e
(Supported Backend: PyTorch) The prefactor of energy loss at the start of Kalman filter updates.
kf_limit_pref_e:#
type: float, optional, default: 1.0
argument path: optimizer[LKF]/kf_limit_pref_e
(Supported Backend: PyTorch) The prefactor of energy loss at the end of training for Kalman filter updates.
kf_start_pref_f:#
type: float, optional, default: 1.0
argument path: optimizer[LKF]/kf_start_pref_f
(Supported Backend: PyTorch) The prefactor of force loss at the start of Kalman filter updates.
kf_limit_pref_f:#
type: float, optional, default: 1.0
argument path: optimizer[LKF]/kf_limit_pref_f
(Supported Backend: PyTorch) The prefactor of force loss at the end of training for Kalman filter updates.
When type is set to AdaMuon:
(Supported Backend: PyTorch)
momentum:#
type: float, optional, default: 0.95, alias: muon_momentum
argument path: optimizer[AdaMuon]/momentum
(Supported Backend: PyTorch) Momentum coefficient for AdaMuon optimizer.
adam_beta1:#
type: float, optional, default: 0.9
argument path: optimizer[AdaMuon]/adam_beta1
(Supported Backend: PyTorch) Adam beta1 coefficient for AdaMuon optimizer.
adam_beta2:#
type: float, optional, default: 0.95
argument path: optimizer[AdaMuon]/adam_beta2
(Supported Backend: PyTorch) Adam beta2 coefficient for AdaMuon optimizer.
weight_decay:#
type: float, optional, default: 0.001
argument path: optimizer[AdaMuon]/weight_decay
(Supported Backend: PyTorch) Weight decay coefficient. Applied only to >=2D parameters (AdaMuon path).
lr_adjust:#
type: float, optional, default: 10.0
argument path: optimizer[AdaMuon]/lr_adjust
(Supported Backend: PyTorch) Learning rate adjustment factor for Adam (1D params). If lr_adjust <= 0: use match-RMS scaling (scale = lr_adjust_coeff * sqrt(max(m, n))), Adam uses lr directly. If lr_adjust > 0: use rectangular correction (scale = sqrt(max(1.0, m/n))), Adam uses lr/lr_adjust.
lr_adjust_coeff:#
type: float, optional, default: 0.2
argument path: optimizer[AdaMuon]/lr_adjust_coeff
(Supported Backend: PyTorch) Coefficient for match-RMS scaling. Only effective when lr_adjust <= 0.
When type is set to HybridMuon:
(Supported Backend: PyTorch) HybridMuon optimizer (DeePMD-kit custom implementation). This is a Hybrid optimizer that automatically combines Muon and Adam. For matrix params: Muon update with Newton-Schulz based on selected muon_mode. For 1D params: Standard Adam. Name-based Adam routing is enabled: final effective parameter name segment containing ‘bias’ or starting with ‘adam_’ (case-insensitive) always uses Adam (no weight decay); segment starting with ‘adamw_’ (case-insensitive) uses AdamW-style decoupled decay. Trailing numeric ParameterList indices are ignored when deriving the effective segment. This is DIFFERENT from PyTorch’s torch.optim.Muon which ONLY supports 2D parameters.
momentum:#
type: float, optional, default: 0.95, alias: muon_momentum
argument path: optimizer[HybridMuon]/momentum
(Supported Backend: PyTorch) Momentum coefficient for HybridMuon optimizer (>=2D params). Used in Nesterov momentum update: m_t = beta*m_{t-1} + (1-beta)*g_t.
adam_beta1:#
type: float, optional, default: 0.9
argument path: optimizer[HybridMuon]/adam_beta1
(Supported Backend: PyTorch) Adam beta1 coefficient for 1D parameters (biases, norms).
adam_beta2:#
type: float, optional, default: 0.95
argument path: optimizer[HybridMuon]/adam_beta2
(Supported Backend: PyTorch) Adam beta2 coefficient for 1D parameters (biases, norms).
weight_decay:#
type: float, optional, default: 0.001
argument path: optimizer[HybridMuon]/weight_decay
(Supported Backend: PyTorch) Weight decay coefficient. Applied only to Muon-routed parameters.
lr_adjust:#
type: float, optional, default: 0.0
argument path: optimizer[HybridMuon]/lr_adjust
(Supported Backend: PyTorch) Learning rate adjustment mode for HybridMuon scaling and Adam learning rate. If lr_adjust <= 0: use match-RMS scaling (scale = coeff*sqrt(max(m,n))), Adam uses lr directly. If lr_adjust > 0: use rectangular correction (scale = sqrt(max(1, m/n))), Adam uses lr/lr_adjust. Default is 0.0 (match-RMS scaling).
lr_adjust_coeff:#
type: float, optional, default: 0.2
argument path: optimizer[HybridMuon]/lr_adjust_coeff
(Supported Backend: PyTorch) Coefficient for match-RMS scaling. Only effective when lr_adjust <= 0.
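The two scaling modes selected by lr_adjust can be written out directly from the formulas above. This is a sketch, not the optimizer's code: the function names are hypothetical, and m and n are assumed to be the row and column counts of the matrix-shaped parameter.

```python
import math


def muon_update_scale(m, n, lr_adjust=0.0, lr_adjust_coeff=0.2):
    """Scale applied to the Muon update of an (m, n) matrix parameter."""
    if lr_adjust <= 0:
        # match-RMS scaling
        return lr_adjust_coeff * math.sqrt(max(m, n))
    # rectangular correction
    return math.sqrt(max(1.0, m / n))


def adam_route_lr(lr, lr_adjust=0.0):
    """Learning rate seen by the Adam-routed (1D) parameters."""
    return lr if lr_adjust <= 0 else lr / lr_adjust


print(muon_update_scale(100, 25), adam_route_lr(1e-3))              # match-RMS mode
print(muon_update_scale(100, 25, lr_adjust=10.0), adam_route_lr(1e-3, 10.0))
```

Note that lr_adjust_coeff has no effect in the second mode, which matches the remark that it is only effective when lr_adjust <= 0.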
muon_mode:#
type: str, optional, default: slice
argument path: optimizer[HybridMuon]/muon_mode
(Supported Backend: PyTorch) Muon routing mode. ‘2d’: only effective-rank-2 params are eligible for Muon; effective rank >2 goes to AdamW-style decoupled decay path. ‘flat’: effective-rank >=2 params are flattened to matrix-view (prod(shape[:-1]), shape[-1]) for Muon. ‘slice’ (default): effective-rank >=3 params use per-slice Muon on the last two dimensions; no cross-slice mixing. Routing uses effective shape after removing singleton dimensions.
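The shape-based part of the routing can be sketched as a small function. Several points are assumptions made for illustration: the function name and the returned route labels are hypothetical, effective-rank-2 parameters are assumed to take the plain Muon path in every mode, parameters of effective rank below 2 are assumed to go to Adam, and the name-based routing (for example names containing ‘bias’) is left out entirely.

```python
import math


def route_parameter(shape, muon_mode="slice"):
    """Return (route, working_shape) for a parameter of the given shape."""
    # Routing uses the effective shape after removing singleton dimensions.
    eff = tuple(d for d in shape if d != 1)
    rank = len(eff)
    if rank < 2:
        return ("adam", eff)
    if rank == 2:
        return ("muon", eff)
    if muon_mode == "2d":
        # Effective rank > 2 is not eligible for Muon in this mode.
        return ("adamw", eff)
    if muon_mode == "flat":
        # Flatten to a matrix view (prod(shape[:-1]), shape[-1]).
        return ("muon", (math.prod(eff[:-1]), eff[-1]))
    if muon_mode == "slice":
        # Per-slice Muon on the last two dimensions, no cross-slice mixing.
        return ("muon_per_slice", eff[-2:])
    raise ValueError(f"unknown muon_mode: {muon_mode!r}")


for mode in ("2d", "flat", "slice"):
    print(mode, route_parameter((4, 64, 32), mode))
```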
flash_muon:#
type: bool, optional, default: True
argument path: optimizer[HybridMuon]/flash_muon
(Supported Backend: PyTorch) Enable triton-accelerated Newton-Schulz orthogonalization. Requires triton and CUDA. Falls back to PyTorch implementation when triton is unavailable or running on CPU.
magma_muon:#
type: bool, optional, default: False
argument path: optimizer[HybridMuon]/magma_muon
(Supported Backend: PyTorch) Enable Magma-lite damping on the Muon route only. When enabled, HybridMuon computes momentum-gradient alignment per Muon block, applies EMA smoothing, and rescales Muon updates to improve stability. Adam/AdamW routes are unchanged.

loss:#

type: dict, optional
argument path: loss
The definition of loss function. The loss type should be set to tensor, ener or left unset.
Depending on the value of type, different sub args are accepted.
type:#
type: str (flag key), default: ener
argument path: loss/type
possible choices: ener, ener_spin, dos, property, tensor
The type of the loss. When the fitting type is ener, the loss type should be set to ener or left unset. When the fitting type is dipole or polar, the loss type should be set to tensor.
When type is set to ener:
start_pref_e:#
type: float | int, optional, default: 0.02
argument path: loss[ener]/start_pref_e
The prefactor of energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the energy label should be provided by file energy.npy in each data system. If both start_pref_e and limit_pref_e are set to 0, then the energy will be ignored.
limit_pref_e:#
type: float | int, optional, default: 1.0
argument path: loss[ener]/limit_pref_e
The prefactor of energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_f:#
type: float | int, optional, default: 1000
argument path: loss[ener]/start_pref_f
The prefactor of force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the force label should be provided by file force.npy in each data system. If both start_pref_f and limit_pref_f are set to 0, then the force will be ignored.
limit_pref_f:#
type: float | int, optional, default: 1.0
argument path: loss[ener]/limit_pref_f
The prefactor of force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
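How a prefactor moves from its start value to its limit value is not spelled out in this section. The sketch below assumes the schedule in which the prefactor follows the learning rate, interpolating linearly in lr / start_lr; treat both the formula and the helper name loss_prefactor as assumptions rather than as a statement of the implementation.

```python
def loss_prefactor(start_pref, limit_pref, lr, start_lr):
    """Prefactor at the current learning rate, assuming it tracks lr / start_lr."""
    ratio = lr / start_lr  # 1 at the start of training, tends to 0 as lr decays
    return limit_pref + (start_pref - limit_pref) * ratio


start_lr = 1e-3
for lr in (1e-3, 1e-5, 1e-8):
    pref_e = loss_prefactor(0.02, 1.0, lr, start_lr)   # default energy prefactors
    pref_f = loss_prefactor(1000, 1.0, lr, start_lr)   # default force prefactors
    print(lr, pref_e, pref_f)
```

Under this assumption the default settings emphasize forces early in training and weight energy and force equally once the learning rate has decayed.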
start_pref_v:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/start_pref_v
The prefactor of virial loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the virial label should be provided by file virial.npy in each data system. If both start_pref_v and limit_pref_v are set to 0, then the virial will be ignored.
limit_pref_v:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/limit_pref_v
The prefactor of virial loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_h:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/start_pref_h
The prefactor of hessian loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the hessian label should be provided by file hessian.npy in each data system. If both start_pref_h and limit_pref_h are set to 0, then the hessian will be ignored.
limit_pref_h:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/limit_pref_h
The prefactor of hessian loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_ae:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/start_pref_ae
The prefactor of atomic energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_ener label should be provided by file atom_ener.npy in each data system. If both start_pref_ae and limit_pref_ae are set to 0, then the atomic energy will be ignored.
limit_pref_ae:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/limit_pref_ae
The prefactor of atomic energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_pf:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/start_pref_pf
The prefactor of atomic prefactor force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_pref label should be provided by file atom_pref.npy in each data system. If both start_pref_pf and limit_pref_pf are set to 0, then the atomic prefactor force will be ignored.
limit_pref_pf:#
type: float | int, optional, default: 0.0
argument path: loss[ener]/limit_pref_pf
The prefactor of atomic prefactor force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
relative_f:#
type: float | NoneType, optional
argument path: loss[ener]/relative_f
If provided, relative force error will be used in the loss. The difference of force will be normalized by the magnitude of the force in the label with a shift given by relative_f, i.e. DF_i / ( || F || + relative_f ) with DF denoting the difference between prediction and label and || F || denoting the L2 norm of the label.
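The normalization DF_i / ( || F || + relative_f ) can be illustrated in plain Python. The sketch assumes that || F || is taken per atom, i.e. the L2 norm of each atom's label force vector; that reading, and the function name, are assumptions.

```python
import math


def relative_force_diff(pred, label, relative_f):
    """Per-atom force differences normalized by (|F_label| + relative_f)."""
    normalized = []
    for f_pred, f_label in zip(pred, label):
        norm = math.sqrt(sum(c * c for c in f_label))  # L2 norm of the label force
        normalized.append([(p - l) / (norm + relative_f)
                           for p, l in zip(f_pred, f_label)])
    return normalized


label = [[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]]  # made-up forces for two atoms
pred = [[3.5, 4.0, 0.0], [0.1, 0.0, 0.0]]
print(relative_force_diff(pred, label, relative_f=5.0))
```

The shift relative_f keeps the denominator away from zero for atoms whose label force is close to zero, as for the second atom in the example.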
enable_atom_ener_coeff:#
type: bool, optional, default: False
argument path: loss[ener]/enable_atom_ener_coeff
If true, the energy will be computed as sum_i c_i E_i. c_i should be provided by file atom_ener_coeff.npy in each data system, otherwise it’s 1.
start_pref_gf:#
type: float, optional, default: 0.0
argument path: loss[ener]/start_pref_gf
The prefactor of generalized force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the drdq label should be provided by file drdq.npy in each data system. If both start_pref_gf and limit_pref_gf are set to 0, then the generalized force will be ignored.
limit_pref_gf:#
type: float, optional, default: 0.0
argument path: loss[ener]/limit_pref_gf
The prefactor of generalized force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
numb_generalized_coord:#
type: int, optional, default: 0
argument path: loss[ener]/numb_generalized_coord
The dimension of generalized coordinates. Required when generalized force loss is used.
use_huber:#
type: bool, optional, default: False
argument path: loss[ener]/use_huber
Enables Huber loss calculation for energy/force/virial terms with user-defined threshold delta (D). The loss function smoothly transitions between L2 and L1 loss:
For absolute prediction errors within D: quadratic loss 0.5 * (error**2)
For absolute errors exceeding D: linear loss D * (|error| - 0.5 * D)
Formula: loss = 0.5 * (error**2) if |error| <= D else D * (|error| - 0.5 * D).
huber_delta:#
type: float, optional, default: 0.01
argument path: loss[ener]/huber_delta
The threshold delta (D) used for Huber loss, controlling transition between L2 and L1 loss.
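The piecewise formula given for use_huber can be written as a small function. This is a sketch for a single scalar error; the name `huber_loss` is illustrative and not a DeePMD-kit function.

```python
def huber_loss(error, delta=0.01):
    """Huber loss of one error value with threshold delta (D)."""
    abs_err = abs(error)
    if abs_err <= delta:
        # Quadratic (L2-like) branch for small errors.
        return 0.5 * error ** 2
    # Linear (L1-like) branch for large errors.
    return delta * (abs_err - 0.5 * delta)


small = huber_loss(0.005)  # quadratic branch
large = huber_loss(1.0)    # linear branch
```

Both branches give 0.5 * D**2 at |error| = D, which is why the transition is continuous.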
When type is set to ener_spin:
start_pref_e:#
type: float | int, optional, default: 0.02
argument path: loss[ener_spin]/start_pref_e
The prefactor of energy loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the energy label should be provided by file energy.npy in each data system. If both start_pref_e and limit_pref_e are set to 0, then the energy will be ignored.
limit_pref_e:#
type: float | int, optional, default: 1.0
argument path: loss[ener_spin]/limit_pref_e
The prefactor of energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
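This page does not spell out how a prefactor moves from its start value to its limit value. The sketch below assumes that it is interpolated with the ratio of the current learning rate to the initial learning rate; this rule, the function name `prefactor`, and the learning-rate values are assumptions made for illustration.

```python
def prefactor(start_pref, limit_pref, lr, start_lr):
    """Interpolate a loss prefactor with the learning-rate ratio (assumed rule)."""
    ratio = lr / start_lr  # equals 1 at the first step and shrinks as lr decays
    return limit_pref * (1.0 - ratio) + start_pref * ratio


# With the ener_spin defaults of the energy term (start 0.02, limit 1.0):
at_start = prefactor(0.02, 1.0, lr=1e-3, start_lr=1e-3)
halfway = prefactor(0.02, 1.0, lr=5e-4, start_lr=1e-3)
```

Under this assumption the prefactor equals start_pref at the first step and approaches limit_pref as the learning rate decays.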
start_pref_fr:#
type: float | int, optional, default: 1000
argument path: loss[ener_spin]/start_pref_fr
The prefactor of force_real_atom loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the force_real_atom label should be provided by file force_real_atom.npy in each data system. If both start_pref_fr and limit_pref_fr are set to 0, then the force_real_atom will be ignored.
limit_pref_fr:#
type: float | int, optional, default: 1.0
argument path: loss[ener_spin]/limit_pref_fr
The prefactor of force_real_atom loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_fm:#
type: float | int, optional, default: 10000
argument path: loss[ener_spin]/start_pref_fm
The prefactor of force_magnetic loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the force_magnetic label should be provided by file force_magnetic.npy in each data system. If both start_pref_fm and limit_pref_fm are set to 0, then the force_magnetic will be ignored.
limit_pref_fm:#
type: float | int, optional, default: 10.0
argument path: loss[ener_spin]/limit_pref_fm
The prefactor of force_magnetic loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_v:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/start_pref_v
The prefactor of virial loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the virial label should be provided by file virial.npy in each data system. If both start_pref_v and limit_pref_v are set to 0, then the virial will be ignored.
limit_pref_v:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/limit_pref_v
The prefactor of virial loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_ae:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/start_pref_ae
The prefactor of atom_ener loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the atom_ener label should be provided by file atom_ener.npy in each data system. If both start_pref_ae and limit_pref_ae are set to 0, then the atom_ener will be ignored.
limit_pref_ae:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/limit_pref_ae
The prefactor of atom_ener loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_pf:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/start_pref_pf
The prefactor of atom_pref loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the atom_pref label should be provided by file atom_pref.npy in each data system. If both start_pref_pf and limit_pref_pf are set to 0, then the atom_pref will be ignored.
limit_pref_pf:#
type: float | int, optional, default: 0.0
argument path: loss[ener_spin]/limit_pref_pf
The prefactor of atom_pref loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
relative_f:#
type: float | NoneType, optional
argument path: loss[ener_spin]/relative_f
If provided, relative force error will be used in the loss. The difference of force will be normalized by the magnitude of the force in the label with a shift given by relative_f, i.e. DF_i / ( || F || + relative_f ) with DF denoting the difference between prediction and label and || F || denoting the L2 norm of the label.
enable_atom_ener_coeff:#
type: bool, optional, default: False
argument path: loss[ener_spin]/enable_atom_ener_coeff
If true, the energy will be computed as sum_i c_i E_i. c_i should be provided by file atom_ener_coeff.npy in each data system, otherwise it’s 1.
When type is set to dos:
start_pref_dos:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/start_pref_dos
The prefactor of the density of states (DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the DOS label should be provided as an .npy file in each data system. If both start_pref_dos and limit_pref_dos are set to 0, then the DOS will be ignored.
limit_pref_dos:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/limit_pref_dos
The prefactor of the density of states (DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_cdf:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/start_pref_cdf
The prefactor of the cumulative distribution function (CDF, the cumulative integral of the DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the CDF label should be provided as an .npy file in each data system. If both start_pref_cdf and limit_pref_cdf are set to 0, then the CDF will be ignored.
limit_pref_cdf:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/limit_pref_cdf
The prefactor of the cumulative distribution function (CDF, the cumulative integral of the DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_ados:#
type: float | int, optional, default: 1.0
argument path: loss[dos]/start_pref_ados
The prefactor of the atomic DOS (site-projected DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the atomic DOS label should be provided as an .npy file in each data system. If both start_pref_ados and limit_pref_ados are set to 0, then the atomic DOS will be ignored.
limit_pref_ados:#
type: float | int, optional, default: 1.0
argument path: loss[dos]/limit_pref_ados
The prefactor of the atomic DOS (site-projected DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
start_pref_acdf:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/start_pref_acdf
The prefactor of the loss on the cumulative integral of the atomic DOS at the start of the training. Should be larger than or equal to 0. If set to a nonzero value, the label for the cumulative integral of the atomic DOS should be provided as an .npy file in each data system. If both start_pref_acdf and limit_pref_acdf are set to 0, then the cumulative integral of the atomic DOS will be ignored.
limit_pref_acdf:#
type: float | int, optional, default: 0.0
argument path: loss[dos]/limit_pref_acdf
The prefactor of the loss on the cumulative integral of the atomic DOS at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
When type is set to property:
loss_func:#
type: str, optional, default: smooth_mae
argument path: loss[property]/loss_func
The loss function to minimize, such as ‘mae’ or ‘smooth_mae’.
metric:#
type: list, optional, default: ['mae']
argument path: loss[property]/metric
The metric for display. This list can include ‘smooth_mae’, ‘mae’, ‘mse’ and ‘rmse’.
beta:#
type: float | int, optional, default: 1.0
argument path: loss[property]/beta
The ‘beta’ parameter in ‘smooth_mae’ loss.
When type is set to tensor:
pref:#
type: float | int
argument path: loss[tensor]/pref
The prefactor of the weight of global loss. It should be larger than or equal to 0. It controls the weight of loss corresponding to global label, i.e. polarizability.npy or dipole.npy, whose shape should be #frames x [9 or 3]. If it’s larger than 0.0, this npy should be included.
pref_atomic:#
type: float | int
argument path: loss[tensor]/pref_atomic
The prefactor of the weight of atomic loss. It should be larger than or equal to 0. It controls the weight of loss corresponding to atomic label, i.e. atomic_polarizability.npy or atomic_dipole.npy, whose shape should be #frames x ([9 or 3] x #atoms). If it’s larger than 0.0, this npy should be included. Both pref and pref_atomic should be provided, and either can be set to 0.0.
enable_atomic_weight:#
type: bool, optional, default: False
argument path: loss[tensor]/enable_atomic_weight
If true, the atomic loss will be reweighted.

training:#

type: dict
argument path: training
The training options.
training_data:#
type: dict, optional
argument path: training/training_data
Configurations of training data.
systems:#
type: list[str] | str
argument path: training/training_data/systems
The data systems for training. This key can be a list or a str. When provided as a string, it can be a system directory path (containing ‘type.raw’) or a parent directory path to recursively search for all system subdirectories. When provided as a list, each string item in the list is processed the same way as individual string inputs, i.e., each path can be a system directory or a parent directory to recursively search for all system subdirectories.
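The directory search described above can be illustrated with pathlib. This is only a sketch of the idea, assuming a system is any directory that directly contains type.raw; the helper name `find_systems` and the directory names in the demo are made up.

```python
import tempfile
from pathlib import Path


def find_systems(path):
    """Return every directory at or below `path` that contains 'type.raw'."""
    root = Path(path)
    # rglob also matches a type.raw placed directly in `root`.
    return sorted(str(p.parent) for p in root.rglob("type.raw"))


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("water/sys0", "water/sys1", "empty"):
        (Path(tmp) / sub).mkdir(parents=True)
    for sub in ("water/sys0", "water/sys1"):
        (Path(tmp) / sub / "type.raw").write_text("0 0 1\n")
    found = [Path(s).relative_to(tmp).as_posix() for s in find_systems(tmp)]
```

In the demo, the parent directory expands to its two system subdirectories, while the directory without type.raw is skipped.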
rglob_patterns:#
type: NoneType | list[str], optional, default: None
argument path: training/training_data/rglob_patterns
The customized patterns used in rglob to collect all training systems. (Supported Backend: PyTorch)
batch_size:#
type: list[int] | str | int, optional, default: auto
argument path: training/training_data/batch_size
This key can be
list: the length of which is the same as the systems. The batch size of each system is given by the elements of the list.
int: all systems use the same batch size.
string “auto”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than 32.
string “auto:N”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than N.
string “mixed:N”: the batch data will be sampled from all systems and merged into a mixed system with the batch size N. For the TensorFlow backend, only the se_atten descriptor is supported.
string “max:N”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no more than N.
string “filter:N”: the same as “max:N” but removes the systems with the number of atoms larger than N from the data set.
If MPI is used, the value should be considered as the batch size per task.
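The “auto”, “auto:N” and “max:N” rules amount to simple integer arithmetic on the number of atoms. The sketch below is one way to read them; the helper name `auto_batch_size` is hypothetical, and the lower bound of 1 for “max:N” is an assumption for systems with more than N atoms.

```python
def auto_batch_size(natoms, rule="auto"):
    """Batch size of one system under the "auto", "auto:N" or "max:N" rule."""
    if rule == "auto":
        rule = "auto:32"  # "auto" uses a threshold of 32
    kind, _, value = rule.partition(":")
    n = int(value)
    if kind == "auto":
        # Smallest batch size with batch_size * natoms >= N (integer ceiling).
        return -(-n // natoms)
    if kind == "max":
        # Largest batch size with batch_size * natoms <= N, at least 1 (assumed).
        return max(n // natoms, 1)
    raise ValueError(f"unsupported rule: {rule}")


per_system = [auto_batch_size(n, "auto:64") for n in (6, 64, 192)]
```

Here `per_system` holds one batch size per system, which matches the list form of batch_size.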
auto_prob:#
type: str, optional, default: prob_sys_size, alias: auto_prob_style
argument path: training/training_data/auto_prob
Determine the probability of systems automatically. The method is assigned by this key and can be
“prob_uniform” : the probabilities of all the systems are equal, namely 1.0/self.get_nsystems()
“prob_sys_size” : the probability of a system is proportional to the number of batches in the system
“prob_sys_size;stt_idx:end_idx:weight;stt_idx:end_idx:weight;…” : the list of systems is divided into blocks. A block is specified by stt_idx:end_idx:weight, where stt_idx is the starting index of the system, end_idx is the ending (exclusive) index of the system, the probabilities of the systems in this block sum up to weight, and the relative probabilities within this block are proportional to the number of batches in the system.
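The three auto_prob styles can be sketched as follows. The helper name `system_probs` is hypothetical, and dividing by the sum of the block weights (so that the result sums to 1 even if the weights do not) is an assumption of this sketch.

```python
def system_probs(nbatches, auto_prob="prob_sys_size"):
    """Sampling probability of each system, given its number of batches."""
    n = len(nbatches)
    if auto_prob == "prob_uniform":
        return [1.0 / n] * n
    parts = auto_prob.split(";")
    if parts[0] != "prob_sys_size":
        raise ValueError(f"unknown auto_prob: {auto_prob}")
    # Without explicit blocks, all systems form one block of weight 1.
    blocks = parts[1:] or [f"0:{n}:1.0"]
    probs = [0.0] * n
    total_weight = 0.0
    for block in blocks:
        stt, end, weight = block.split(":")
        stt, end, weight = int(stt), int(end), float(weight)
        block_batches = sum(nbatches[stt:end])  # end index is exclusive
        for i in range(stt, end):
            probs[i] = weight * nbatches[i] / block_batches
        total_weight += weight
    return [p / total_weight for p in probs]


blocked = system_probs([10, 30, 20, 20], "prob_sys_size;0:2:0.5;2:4:0.5")
```

In the blocked example, each block receives half of the total probability, and that half is split by batch count inside the block.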
sys_probs:#
type: NoneType | list[float], optional, default: None, alias: sys_weights
argument path: training/training_data/sys_probs
A list of float if specified. Should be of the same length as systems, specifying the probability of each system.
validation_data:#
type: NoneType | dict, optional, default: None
argument path: training/validation_data
Configurations of validation data. Similar to that of training data, except that a numb_btch argument may be configured.
systems:#
type: list[str] | str
argument path: training/validation_data/systems
The data systems for validation. This key can be a list or a str. When provided as a string, it can be a system directory path (containing ‘type.raw’) or a parent directory path to recursively search for all system subdirectories. When provided as a list, each string item in the list is processed the same way as individual string inputs, i.e., each path can be a system directory or a parent directory to recursively search for all system subdirectories.
rglob_patterns:#
type: NoneType | list[str], optional, default: None
argument path: training/validation_data/rglob_patterns
The customized patterns used in rglob to collect all validation systems. (Supported Backend: PyTorch)
batch_size:#
type: list[int] | str | int, optional, default: auto
argument path: training/validation_data/batch_size
This key can be
list: the length of which is the same as the systems. The batch size of each system is given by the elements of the list.
int: all systems use the same batch size.
string “auto”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than 32.
string “auto:N”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than N.
auto_prob:#
type: str, optional, default: prob_sys_size, alias: auto_prob_style
argument path: training/validation_data/auto_prob
Determine the probability of systems automatically. The method is assigned by this key and can be
“prob_uniform” : the probabilities of all the systems are equal, namely 1.0/self.get_nsystems()
“prob_sys_size” : the probability of a system is proportional to the number of batches in the system
“prob_sys_size;stt_idx:end_idx:weight;stt_idx:end_idx:weight;…” : the list of systems is divided into blocks. A block is specified by stt_idx:end_idx:weight, where stt_idx is the starting index of the system, end_idx is the ending (exclusive) index of the system, the probabilities of the systems in this block sum up to weight, and the relative probabilities within this block are proportional to the number of batches in the system.
sys_probs:#
type: NoneType | list[float], optional, default: None, alias: sys_weights
argument path: training/validation_data/sys_probs
A list of float if specified. Should be of the same length as systems, specifying the probability of each system.
numb_btch:#
type: int, optional, default: 1, alias: numb_batch
argument path: training/validation_data/numb_btch
An integer that specifies the number of batches to be sampled for each validation period.
stat_file:#
type: str, optional
argument path: training/stat_file
(Supported Backend: PyTorch) The file path for saving the data statistics results. If set, the results will be saved and directly loaded during the next training session, avoiding the need to recalculate the statistics. If the file extension is .h5 or .hdf5, an HDF5 file is used to store the statistics; otherwise, a directory containing NumPy binary files is used.
mixed_precision:#
type: dict, optional
argument path: training/mixed_precision
Configurations of mixed precision.
output_prec:#
type: str, optional, default: float32
argument path: training/mixed_precision/output_prec
The precision for mixed precision params, i.e. the precision of the trainable variables during the mixed precision training process. Currently only float32 is supported.
compute_prec:#
type: str
argument path: training/mixed_precision/compute_prec
The precision for mixed precision compute, i.e. the compute precision during the mixed precision training process. Currently float16 and bfloat16 are supported.
numb_steps:#
type: int, optional, aliases: stop_batch, num_step, num_steps, numb_step
argument path: training/numb_steps
Number of training steps (num_step). Each training step uses one batch of data. Mutually exclusive with num_epoch in single-task mode. In multi-task mode, this is mutually exclusive with num_epoch_dict. Accepted names: num_step, num_steps, numb_step, numb_steps, stop_batch.
numb_epoch:#
type: float | int, optional, aliases: num_epochs, num_epoch, numb_epochs
argument path: training/numb_epoch
Number of training epochs (num_epoch; can be fractional) for single-task mode only. Because each step samples the dataset stochastically, this corresponds to an expected epoch count rather than a deterministic full pass. When num_step is not set, the total steps are computed as ceil(num_epoch * total_numb_batch). total_numb_batch is computed as ceil(max_i(n_bch_i / p_i)), where n_bch_i is the number of batches for system i and p_i is the sampling probability after sys_probs/auto_prob normalization. Mutually exclusive with num_step. For multi-task mode, use num_epoch_dict instead. Accepted names: num_epoch, num_epochs, numb_epoch, numb_epochs.
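The step count derived from num_epoch can be reproduced directly from the two formulas above. This is a sketch; the function name `total_steps` and the example numbers are illustrative.

```python
import math


def total_steps(num_epoch, nbatches, probs):
    """Steps for `num_epoch` epochs: ceil(num_epoch * total_numb_batch).

    nbatches: n_bch_i, the number of batches of each system.
    probs:    p_i, the normalized sampling probability of each system.
    """
    total_numb_batch = math.ceil(max(n / p for n, p in zip(nbatches, probs)))
    return math.ceil(num_epoch * total_numb_batch)


# Two systems with 10 and 30 batches.
proportional = total_steps(2.5, [10, 30], [0.25, 0.75])
uniform = total_steps(1, [10, 30], [0.5, 0.5])
```

With uniform sampling, the larger system is drawn only half of the time, so the expected pass over it takes 60 steps rather than 40.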
seed:#
type: NoneType | int, optional
argument path: training/seed
The random seed for getting frames from the training data set.
disp_file:#
type: str, optional, default: lcurve.out
argument path: training/disp_file
The file for printing learning curve.
disp_freq:#
type: int, optional, default: 1000
argument path: training/disp_freq
The frequency of printing learning curve.
save_freq:#
type: int, optional, default: 1000
argument path: training/save_freq
The frequency of saving checkpoints.
save_ckpt:#
type: str, optional, default: model.ckpt
argument path: training/save_ckpt
The path prefix for saving checkpoint files.
max_ckpt_keep:#
type: int, optional, default: 5
argument path: training/max_ckpt_keep
The maximum number of checkpoints to keep. The oldest checkpoints will be deleted once the number of checkpoints exceeds max_ckpt_keep. Defaults to 5.
change_bias_after_training:#
type: bool, optional, default: False
argument path: training/change_bias_after_training
Whether to change the output bias after the last training step, by performing predictions using the trained model on the training data and doing a least-squares fit on the errors to add the target shift to the bias.
disp_training:#
type: bool, optional, default: True
argument path: training/disp_training
Displaying verbose information during training.
time_training:#
type: bool, optional, default: True
argument path: training/time_training
Timing during training.
disp_avg:#
type: bool, optional, default: False
argument path: training/disp_avg
(Supported Backend: PyTorch) Display the average loss over the display interval for training sets.
profiling:#
type: bool, optional, default: False
argument path: training/profiling
Export the profiling results to the Chrome JSON file for performance analysis, driven by the legacy TensorFlow profiling API or PyTorch Profiler. The output file will be saved to profiling_file. In the PyTorch backend, when enable_profiler is True, this option is ignored, since the profiling results will be saved to the TensorBoard log.
profiling_file:#
type: str, optional, default: timeline.json
argument path: training/profiling_file
Output file for profiling.
enable_profiler:#
type: bool, optional, default: False
argument path: training/enable_profiler
Export the profiling results to the TensorBoard log for performance analysis, driven by TensorFlow Profiler (available in TensorFlow 2.3) or PyTorch Profiler. The log will be saved to tensorboard_log_dir.
tensorboard:#
type: bool, optional, default: False
argument path: training/tensorboard
Enable TensorBoard.
tensorboard_log_dir:#
type: str, optional, default: log
argument path: training/tensorboard_log_dir
The log directory of TensorBoard outputs.
tensorboard_freq:#
type: int, optional, default: 1
argument path: training/tensorboard_freq
The frequency of writing tensorboard events.
gradient_max_norm:#
type: float, optional
argument path: training/gradient_max_norm
(Supported Backend: PyTorch) Clips the gradient norm to a maximum value. If the gradient norm exceeds this value, it will be clipped to this limit. No gradient clipping will occur if set to 0.
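Norm clipping rescales the whole gradient when its norm exceeds the limit. The sketch below shows the arithmetic on a flat list of numbers; it is a plain-Python illustration with a hypothetical function name, not the code used by the PyTorch backend.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale `grads` so that its L2 norm does not exceed `max_norm`."""
    if max_norm == 0:
        return list(grads)  # 0 disables clipping, as described above
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already within the limit
    scale = max_norm / norm
    return [g * scale for g in grads]


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)   # norm 5 is scaled down to 1
untouched = clip_by_norm([3.0, 4.0], max_norm=0)   # clipping disabled
```

The direction of the gradient is preserved; only its magnitude changes.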
acc_freq:#
type: int, optional, default: 1
argument path: training/acc_freq
(Supported Backend: Paddle) Gradient accumulation steps (number of steps to accumulate gradients before performing an update).
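Gradient accumulation can be illustrated with a toy scalar optimizer. The sketch assumes that the accumulated gradients are averaged before the update; whether the backend averages or sums is not stated here, and the function name is hypothetical.

```python
def train_with_accumulation(grads_per_step, acc_freq, lr=0.1, weight=0.0):
    """Apply one averaged SGD update every `acc_freq` steps (toy example)."""
    accumulated = 0.0
    updates = 0
    for step, grad in enumerate(grads_per_step, start=1):
        accumulated += grad
        if step % acc_freq == 0:
            weight -= lr * accumulated / acc_freq  # averaged update (assumed)
            accumulated = 0.0
            updates += 1
    return weight, updates


final_weight, n_updates = train_with_accumulation([1.0, 3.0, 2.0, 2.0], acc_freq=2)
```

With acc_freq set to 2, four steps produce two parameter updates, each based on two batches.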
zero_stage:#
type: int, optional, default: 0
argument path: training/zero_stage
(Supported Backend: PyTorch) ZeRO optimization stage for distributed training memory reduction. This key can be
0: standard DDP, with the lowest communication overhead but the highest memory usage (full optimizer states, gradients, and parameters replicated on every GPU).
1: DDP + ZeRO stage-1, which shards optimizer states across GPUs via ZeroRedundancyOptimizer; same communication volume as DDP (2x model size), but optimizer memory is reduced to 1/N per GPU.
2: FSDP2 stage-2, which shards optimizer states and gradients; same communication volume as stage-1, but gradient memory is further reduced to 1/N per GPU. Note: FSDP2 introduces DTensor dispatch overhead that can slow down models with many small layers; use torch.compile to mitigate.
3: FSDP2 stage-3, which shards parameters as well; maximum memory savings but 50% more communication (3x model size) due to parameter all-gather in both forward and backward passes.
Default is 0. Requires distributed launch via torchrun. Currently supports single-task training; does not support LKF or change_bias_after_training.
enable_compile:#
type: bool, optional, default: False
argument path: training/enable_compile
(Supported Backend: PyTorch Exportable) Enable torch.compile to accelerate training. Uses make_fx to decompose autograd into primitive ops, then compiles with torch.compile/Inductor for kernel fusion. The first training step will be slower due to one-time compilation.

nvnmd:#

type: dict, optional
argument path: nvnmd
The nvnmd options.
version:#
type: int
argument path: nvnmd/version
Configures the nvnmd version (0 | 1): 0 for 4 types, 1 for 32 types.
max_nnei:#
type: int
argument path: nvnmd/max_nnei
Configures the maximum number of neighbors: 128 or 256 for version 0, 128 for version 1.
net_size:#
type: int
argument path: nvnmd/net_size
Configures the number of nodes of fitting_net. It can only be set to 128.
map_file:#
type: str
argument path: nvnmd/map_file
A file containing the mapping tables to replace the calculation of embedding nets
config_file:#
type: str
argument path: nvnmd/config_file
A file containing the parameters about how to implement the model in certain hardware
weight_file:#
type: str
argument path: nvnmd/weight_file
A *.npy file containing the weights of the model.
enable:#
type: bool
argument path: nvnmd/enable
Enables the nvnmd training.
restore_descriptor:#
type: bool
argument path: nvnmd/restore_descriptor
Enables restoring the parameters of embedding_net from weight.npy.
restore_fitting_net:#
type: bool
argument path: nvnmd/restore_fitting_net
Enables restoring the parameters of fitting_net from weight.npy.
quantize_descriptor:#
type: bool
argument path: nvnmd/quantize_descriptor
Enables the quantization of the descriptor.
quantize_fitting_net:#
type: bool
argument path: nvnmd/quantize_fitting_net
Enables the quantization of fitting_net.
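The value constraints stated for version, max_nnei and net_size can be checked before training. The helper below is a hypothetical pre-flight check written for illustration, not a DeePMD-kit function, and it covers only the constraints listed in this section.

```python
def check_nvnmd(cfg):
    """Return a list of problems found in an nvnmd section given as a dict."""
    problems = []
    version = cfg.get("version")
    if version not in (0, 1):
        problems.append("version must be 0 or 1")
    # Allowed max_nnei values depend on the version.
    allowed_nnei = {0: (128, 256), 1: (128,)}.get(version, ())
    if cfg.get("max_nnei") not in allowed_nnei:
        problems.append(f"max_nnei must be one of {allowed_nnei} for version {version}")
    if cfg.get("net_size") != 128:
        problems.append("net_size must be 128")
    return problems


ok = check_nvnmd({"version": 0, "max_nnei": 256, "net_size": 128})
bad = check_nvnmd({"version": 1, "max_nnei": 256, "net_size": 128})
```

An empty list means that the three keys are consistent with each other; the second example is rejected because version 1 only allows 128 neighbors.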

5.4.1. Writing JSON files using Visual Studio Code#

When writing JSON files using Visual Studio Code, one can benefit from IntelliSense and validation by adding a JSON schema. To do so, in a VS Code workspace, one can generate a JSON schema file for the input file by running the following command:

dp doc-train-input --out-type json_schema > deepmd.json

Then one can map the schema by updating the workspace settings in the .vscode/settings.json file as follows:

{
  "json.schemas": [
    {
      "fileMatch": [
        "/**/*.json"
      ],
      "url": "./deepmd.json"
    }
  ]
}
