dpgen simplify parameters

simplify_jdata:
type: dict
argument path: simplify_jdata

Parameters for simplify.json, the first argument of dpgen simplify.

type_map:
type: list
argument path: simplify_jdata/type_map

Atom types. Reminder: the elements in param.json, type.raw, and data.lmp (when using LAMMPS) should be in the same order.

mass_map:
type: str | list, optional, default: auto
argument path: simplify_jdata/mass_map

Standard atomic weights (default: “auto”). To use isotopes, non-standard element names, chemical symbols, or atomic numbers in the type_map list, customize the mass_map list instead of using “auto”.
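As an illustration, the sketch below writes type_map and an explicit mass_map as a Python dictionary and checks that the two lists have the same length. The element choice, the mass values, and the helper name check_mass_map are illustrative assumptions, not part of DP-GEN.

```python
import json

# Hypothetical two-element system. The order of type_map must match
# the order used in param.json, type.raw, and data.lmp.
simplify_jdata = {
    "type_map": ["H", "C"],
    # Explicit masses instead of "auto" (illustrative values).
    "mass_map": [1.008, 12.011],
}


def check_mass_map(jdata):
    """Return True if mass_map is "auto" or has one entry per atom type."""
    mass_map = jdata.get("mass_map", "auto")
    return mass_map == "auto" or len(mass_map) == len(jdata["type_map"])


print(json.dumps(simplify_jdata, indent=2))
print(check_mass_map(simplify_jdata))
```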

use_ele_temp:
type: int, optional, default: 0
argument path: simplify_jdata/use_ele_temp

Currently only supported when fp_style is vasp.

  • 0: no electron temperature.

  • 1: electron temperature as frame parameter.

  • 2: electron temperature as atom parameter.

init_data_prefix:
type: str, optional
argument path: simplify_jdata/init_data_prefix

Prefix of initial data directories.

init_data_sys:
type: list
argument path: simplify_jdata/init_data_sys

Paths of initial data. Each path can be either a system directory containing NumPy files or an HDF5 file. You may use either absolute or relative paths here. Systems will be detected recursively in the directories or the HDF5 file.
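A minimal sketch of how init_data_prefix and init_data_sys could be combined, assuming the prefix is simply joined in front of each listed path. The helper name resolve_init_data and the example paths are hypothetical.

```python
import os


def resolve_init_data(init_data_sys, init_data_prefix=None):
    """Prepend the optional prefix to each entry of init_data_sys.

    Assumption: the prefix is joined to every path; without a prefix
    the paths are returned unchanged.
    """
    if init_data_prefix is None:
        return list(init_data_sys)
    return [os.path.join(init_data_prefix, path) for path in init_data_sys]


# Hypothetical entries: one NumPy system directory and one HDF5 file.
print(resolve_init_data(["water/init", "extra.hdf5"], "data"))
```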

sys_format:
type: str, optional, default: vasp/poscar
argument path: simplify_jdata/sys_format

Format of sys_configs.

init_batch_size:
type: str | list, optional
argument path: simplify_jdata/init_batch_size

Each number is the batch_size of the corresponding system in init_data_sys for training. One recommended rule for setting sys_batch_size and init_batch_size is that batch_size multiplied by the number of atoms of the structure should be larger than 32. If set to auto, the batch size will be 32 divided by the number of atoms.
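The rule above can be sketched as a small helper. Rounding up, so that batch_size times the number of atoms reaches at least 32, is an assumption of this sketch; the function name auto_batch_size is hypothetical.

```python
import math


def auto_batch_size(natoms, target=32):
    """Smallest batch size with batch_size * natoms >= target.

    Assumption: "32 divided by number of atoms" is rounded up, and the
    batch size is never smaller than 1.
    """
    return max(1, math.ceil(target / natoms))


for natoms in (5, 8, 64):
    print(natoms, auto_batch_size(natoms))
```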

sys_configs_prefix:
type: str, optional
argument path: simplify_jdata/sys_configs_prefix

Prefix of sys_configs.

sys_configs:
type: list
argument path: simplify_jdata/sys_configs

Directories containing the structures to be explored in iterations. Wildcard characters are supported here.

sys_batch_size:
type: list, optional
argument path: simplify_jdata/sys_batch_size

Each number is the batch_size for training of corresponding system in sys_configs. If set to auto, batch size will be 32 divided by number of atoms.

labeled:
type: bool, optional, default: False
argument path: simplify_jdata/labeled

If true, the initial data is labeled.

pick_data:
type: str | list
argument path: simplify_jdata/pick_data

Path, or list of paths, to the data to pick from: either a directory in deepmd/npy format or an HDF5 file in deepmd/hdf5 format. Systems are detected recursively.

init_pick_number:
type: int
argument path: simplify_jdata/init_pick_number

The number of initial pick data.

iter_pick_number:
type: int
argument path: simplify_jdata/iter_pick_number

The number of pick data in each iteration.

model_devi_f_trust_lo:
type: float
argument path: simplify_jdata/model_devi_f_trust_lo

The lower bound of the force model deviation used for selection.

model_devi_f_trust_hi:
type: float
argument path: simplify_jdata/model_devi_f_trust_hi

The upper bound of the force model deviation used for selection.
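The two bounds can be read as splitting frames into three groups. The sketch below assumes that frames below the lower bound count as accurate, frames between the bounds are candidates for picking, and frames at or above the upper bound are treated as failed. The labels, the boundary handling, and the function name classify_frame are assumptions of this sketch.

```python
def classify_frame(max_devi_f, trust_lo, trust_hi):
    """Classify one frame by its force model deviation.

    Assumption: lo is inclusive for candidates, hi is exclusive.
    """
    if max_devi_f < trust_lo:
        return "accurate"
    if max_devi_f < trust_hi:
        return "candidate"
    return "failed"


# Illustrative deviations and bounds.
for devi in (0.05, 0.25, 0.60):
    print(devi, classify_frame(devi, trust_lo=0.10, trust_hi=0.40))
```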

numb_models:
type: int
argument path: simplify_jdata/numb_models

Number of models to be trained in 00.train. 4 is recommended.

training_iter0_model_path:
type: list, optional
argument path: simplify_jdata/training_iter0_model_path

The models used to initialize training in the first iteration. The number of elements should be equal to numb_models.

training_init_model:
type: bool, optional
argument path: simplify_jdata/training_init_model

For iteration > 0, the model parameters will be initialized from the model trained at the previous iteration. For iteration == 0, the model parameters will be initialized from training_iter0_model_path.

default_training_param:
type: dict
argument path: simplify_jdata/default_training_param

Training parameters for deepmd-kit in 00.train. Instructions can be found at https://github.com/deepmodeling/deepmd-kit.

dp_compress:
type: bool, optional, default: False
argument path: simplify_jdata/dp_compress

Use dp compress to compress the model.

training_reuse_iter:
type: int | NoneType, optional
argument path: simplify_jdata/training_reuse_iter

The minimal iteration index from which training continues from the models of the previous iteration.

training_reuse_old_ratio:
type: float | NoneType, optional
argument path: simplify_jdata/training_reuse_old_ratio

The probability proportion of old data during training. This option is only adopted when continuing training models from old models. This option will override default parameters.

training_reuse_numb_steps:
type: int | NoneType, optional, default: 400000, alias: training_reuse_stop_batch
argument path: simplify_jdata/training_reuse_numb_steps

Number of training batches. This option is only adopted when continuing training models from old models. This option will override default parameters.

training_reuse_start_lr:
type: float | NoneType, optional, default: 0.0001
argument path: simplify_jdata/training_reuse_start_lr

The learning rate at the start of the training. This option is only adopted when continuing training models from old models. This option will override default parameters.

training_reuse_start_pref_e:
type: int | float | NoneType, optional, default: 0.1
argument path: simplify_jdata/training_reuse_start_pref_e

The prefactor of energy loss at the start of the training. This option is only adopted when continuing training models from old models. This option will override default parameters.

training_reuse_start_pref_f:
type: int | float | NoneType, optional, default: 100
argument path: simplify_jdata/training_reuse_start_pref_f

The prefactor of force loss at the start of the training. This option is only adopted when continuing training models from old models. This option will override default parameters.
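For orientation, the training_reuse_* options can be written together as one fragment. In the sketch below, the values for training_reuse_numb_steps, training_reuse_start_lr, training_reuse_start_pref_e, and training_reuse_start_pref_f are the defaults listed above; the values for training_reuse_iter and training_reuse_old_ratio are hypothetical.

```python
import json

reuse_settings = {
    "training_reuse_iter": 2,            # hypothetical: reuse from iteration 2 on
    "training_reuse_old_ratio": 0.9,     # hypothetical proportion of old data
    "training_reuse_numb_steps": 400000,  # documented default
    "training_reuse_start_lr": 0.0001,    # documented default
    "training_reuse_start_pref_e": 0.1,   # documented default
    "training_reuse_start_pref_f": 100,   # documented default
}

# Print the fragment as it would appear inside simplify.json.
print(json.dumps(reuse_settings, indent=2))
```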

model_devi_activation_func:
type: list | NoneType, optional
argument path: simplify_jdata/model_devi_activation_func

The activation function in the model. The shape of list should be (N_models, 2), where 2 represents the embedding and fitting network. This option will override default parameters.
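The expected shape (N_models, 2) can be checked with a short helper. This is a sketch only: the function name check_activation_funcs is hypothetical, and the activation names in the example are illustrative.

```python
def check_activation_funcs(funcs, numb_models):
    """Return True if funcs is None or has shape (numb_models, 2).

    Each inner pair holds the activation for the embedding network and
    for the fitting network, in that order.
    """
    if funcs is None:
        return True
    return len(funcs) == numb_models and all(len(pair) == 2 for pair in funcs)


# Illustrative setting for numb_models = 4.
funcs = [["tanh", "tanh"], ["tanh", "gelu"], ["gelu", "tanh"], ["gelu", "gelu"]]
print(check_activation_funcs(funcs, numb_models=4))
```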

fp_task_max:
type: int, optional
argument path: simplify_jdata/fp_task_max

Maximum number of structures to be calculated in 02.fp of each iteration.

fp_task_min:
type: int, optional
argument path: simplify_jdata/fp_task_min

Minimum number of structures to be calculated in 02.fp of each iteration.

fp_accurate_threshold:
type: float, optional
argument path: simplify_jdata/fp_accurate_threshold

If the accurate ratio is larger than this number, no fp calculation will be performed, i.e. fp_task_max = 0.

fp_accurate_soft_threshold:
type: float, optional
argument path: simplify_jdata/fp_accurate_soft_threshold

If the accurate ratio is between this number and fp_accurate_threshold, the fp_task_max linearly decays to zero.
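The interplay of fp_task_max, fp_accurate_soft_threshold, and fp_accurate_threshold can be sketched as a piecewise function. Truncating the result to an integer and treating a ratio equal to fp_accurate_threshold as "no fp calculation" are assumptions of this sketch, and the function name is hypothetical.

```python
def fp_task_max_for_ratio(accurate_ratio, fp_task_max, soft_threshold, hard_threshold):
    """Effective fp_task_max for a given accurate ratio.

    Below the soft threshold the full fp_task_max is used; between the
    soft and hard thresholds it decays linearly to zero; at or above
    the hard threshold no fp calculation is performed.
    """
    if accurate_ratio < soft_threshold:
        return fp_task_max
    if accurate_ratio < hard_threshold:
        fraction = (hard_threshold - accurate_ratio) / (hard_threshold - soft_threshold)
        return int(fp_task_max * fraction)
    return 0


# Illustrative thresholds.
for ratio in (0.25, 0.75, 1.0):
    print(ratio, fp_task_max_for_ratio(ratio, 100, 0.5, 1.0))
```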

Depending on the value of fp_style, different sub args are accepted.

fp_style:
type: str (flag key), default: none
argument path: simplify_jdata/fp_style
possible choices: none, vasp, gaussian

First-principles software, used if labeled is false. Options currently include “vasp” and “gaussian”.

When fp_style is set to none:

No fp.

When fp_style is set to vasp:

VASP.

fp_pp_path:
type: str
argument path: simplify_jdata[vasp]/fp_pp_path

Directory where the pseudopotential files to be used for 02.fp are located.

fp_pp_files:
type: list
argument path: simplify_jdata[vasp]/fp_pp_files

Pseudopotential files to be used for 02.fp. Note that the order of elements should correspond to the order in type_map.
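A sketch of a VASP fragment and an order check follows. All paths and file names are hypothetical, and the check relies on an assumed naming convention in which each file name ends with an underscore and the element symbol; DP-GEN itself does not require this convention.

```python
type_map = ["H", "C"]

vasp_settings = {
    "fp_style": "vasp",
    "fp_pp_path": "/path/to/potcars",          # hypothetical directory
    "fp_pp_files": ["POTCAR_H", "POTCAR_C"],   # hypothetical file names
    "fp_incar": "/path/to/INCAR",              # hypothetical path
}


def pp_files_match_type_map(type_map, fp_pp_files):
    """Check that fp_pp_files lists one file per element, in type_map order.

    Assumption: file names follow the hypothetical pattern "<name>_<element>".
    """
    if len(type_map) != len(fp_pp_files):
        return False
    return all(name.endswith("_" + elem) for elem, name in zip(type_map, fp_pp_files))


print(pp_files_match_type_map(type_map, vasp_settings["fp_pp_files"]))
```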

fp_incar:
type: str
argument path: simplify_jdata[vasp]/fp_incar

Input file for VASP. INCAR must specify KSPACING and KGAMMA.

fp_aniso_kspacing:
type: list, optional
argument path: simplify_jdata[vasp]/fp_aniso_kspacing

Set anisotropic kspacing. Usually useful for 1-D or 2-D materials. Only supported for VASP. If it is set, the KSPACING key in INCAR will be ignored.

cvasp:
type: bool, optional
argument path: simplify_jdata[vasp]/cvasp

If cvasp is true, DP-GEN will use Custodian to help control VASP calculation.

ratio_failed:
type: float, optional
argument path: simplify_jdata[vasp]/ratio_failed

Check the ratio of unsuccessfully terminated jobs. If too many FP tasks are not converged, RuntimeError will be raised.
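The check described above can be sketched as follows. The function name, the handling of an empty task list, and the strict "greater than" comparison are assumptions of this sketch.

```python
def check_failed_ratio(n_failed, n_total, ratio_failed):
    """Return the failure ratio, raising RuntimeError if it exceeds ratio_failed."""
    if n_total == 0:
        return 0.0
    ratio = n_failed / n_total
    if ratio > ratio_failed:
        raise RuntimeError(
            f"{n_failed} of {n_total} FP tasks failed "
            f"(ratio {ratio:.2f} exceeds ratio_failed {ratio_failed:.2f})"
        )
    return ratio


print(check_failed_ratio(1, 10, ratio_failed=0.2))
```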

fp_skip_bad_box:
type: str, optional
argument path: simplify_jdata[vasp]/fp_skip_bad_box

Skip the configurations that are obviously unreasonable before 02.fp.

When fp_style is set to gaussian:

Gaussian. The command should be set as g16 < input.

use_clusters:
type: bool, optional, default: False
argument path: simplify_jdata[gaussian]/use_clusters

If set to true, clusters will be taken instead of the whole system.

cluster_cutoff:
type: float, optional
argument path: simplify_jdata[gaussian]/cluster_cutoff

The soft cutoff radius of clusters if use_clusters is set to true. Molecules will be taken as a whole even if some of their atoms are outside the cluster. Use cluster_cutoff_hard to only take atoms within the hard cutoff radius.

cluster_cutoff_hard:
type: float, optional
argument path: simplify_jdata[gaussian]/cluster_cutoff_hard

The hard cutoff radius of clusters if use_clusters is set to true. Outside the hard cutoff radius, atoms will not be taken even if they are in a molecule where some atoms are within the cutoff radius.

cluster_minify:
type: bool, optional, default: False
argument path: simplify_jdata[gaussian]/cluster_minify

If enabled, when an atom within the soft cutoff radius connects a single bond with a non-hydrogen atom out of the soft cutoff radius, the outer atom will be replaced by a hydrogen atom. When the outer atom is a hydrogen atom, the outer atom will be kept. In this case, other atoms out of the soft cutoff radius will be removed.

fp_params:
type: dict
argument path: simplify_jdata[gaussian]/fp_params

Parameters for Gaussian calculation.

keywords:
type: str | list
argument path: simplify_jdata[gaussian]/fp_params/keywords

Keywords for Gaussian input, e.g. force b3lyp/6-31g**. If a list, run multiple steps.

multiplicity:
type: str | int, optional, default: auto
argument path: simplify_jdata[gaussian]/fp_params/multiplicity

Spin multiplicity for Gaussian input. If auto, multiplicity will be detected automatically, with the following rules: when fragment_guesses=True, multiplicity will +1 for each radical, and +2 for each oxygen molecule; when fragment_guesses=False, multiplicity will be 1 or 2, but +2 for each oxygen molecule.

nproc:
type: int
argument path: simplify_jdata[gaussian]/fp_params/nproc

The number of processors for Gaussian input.

charge:
type: int, optional, default: 0
argument path: simplify_jdata[gaussian]/fp_params/charge

Molecule charge. Only used when charge is not provided by the system.

fragment_guesses:
type: bool, optional, default: False
argument path: simplify_jdata[gaussian]/fp_params/fragment_guesses

Initial guess generated from fragment guesses. If True, multiplicity should be auto.

basis_set:
type: str, optional
argument path: simplify_jdata[gaussian]/fp_params/basis_set

Custom basis set.

keywords_high_multiplicity:
type: str, optional
argument path: simplify_jdata[gaussian]/fp_params/keywords_high_multiplicity

Keywords for points with multiple radicals. multiplicity should be auto. If not set, the normal keywords are used as a fallback.
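The constraints among the fp_params entries can be sketched as a small consistency check. The keywords value is the example given above; the nproc value and the helper name check_fp_params are hypothetical.

```python
fp_params = {
    "keywords": "force b3lyp/6-31g**",
    "multiplicity": "auto",
    "nproc": 4,                # hypothetical processor count
    "charge": 0,
    "fragment_guesses": True,
}


def check_fp_params(params):
    """Return True if multiplicity is "auto" whenever that is required.

    multiplicity should be "auto" when fragment_guesses is True or when
    keywords_high_multiplicity is set.
    """
    multiplicity = params.get("multiplicity", "auto")
    needs_auto = bool(params.get("fragment_guesses", False)) or (
        "keywords_high_multiplicity" in params
    )
    return multiplicity == "auto" or not needs_auto


print(check_fp_params(fp_params))
```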

ratio_failed:
type: float, optional
argument path: simplify_jdata[gaussian]/ratio_failed

Check the ratio of unsuccessfully terminated jobs. If too many FP tasks are not converged, RuntimeError will be raised.