DPGEN2 configurations

Op configs

RunDPTrain

impl:
type: str, optional, default: tensorflow, alias: backend
argument path: impl

The implementation (backend) of DP. It can be ‘tensorflow’ or ‘pytorch’; the default is ‘tensorflow’.

init_model_policy:
type: str, optional, default: no
argument path: init_model_policy

The policy of init-model training. It can be

  • ‘no’: No init-model training. Train from scratch.

  • ‘yes’: Do init-model training.

  • ‘old_data_larger_than:XXX’: Do init-model training if the training data size of the previous model is larger than XXX, where XXX is an integer.
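The three policy values above can be read as a simple decision rule. The sketch below is only an illustration of that rule: the helper name `should_init_model` and its `old_data_size` argument are hypothetical and are not taken from the DPGEN2 source.

```python
def should_init_model(policy: str, old_data_size: int) -> bool:
    """Hypothetical helper: decide whether to do init-model training.

    policy        -- value of init_model_policy
    old_data_size -- training data size of the previous model
    """
    if policy == "no":
        return False  # train from scratch
    if policy == "yes":
        return True   # always start from the previous model
    prefix = "old_data_larger_than:"
    if policy.startswith(prefix):
        # XXX is documented as an integer threshold
        threshold = int(policy[len(prefix):])
        return old_data_size > threshold
    raise ValueError(f"unrecognized init_model_policy: {policy!r}")


print(should_init_model("old_data_larger_than:1000", 1500))
```

With a threshold of 1000, a previous data size of 1500 selects init-model training, while a size of exactly 1000 does not, since the description says "larger than".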

init_model_old_ratio:
type: float, optional, default: 0.9
argument path: init_model_old_ratio

The frequency ratio of old data over new data

init_model_numb_steps:
type: int, optional, default: 400000, alias: init_model_stop_batch
argument path: init_model_numb_steps

The number of training steps for init-model training

init_model_start_lr:
type: float, optional, default: 0.0001
argument path: init_model_start_lr

The start learning rate for init-model training

init_model_start_pref_e:
type: float, optional, default: 0.1
argument path: init_model_start_pref_e

The start energy prefactor in the loss for init-model training

init_model_start_pref_f:
type: float, optional, default: 100
argument path: init_model_start_pref_f

The start force prefactor in the loss for init-model training

init_model_start_pref_v:
type: float, optional, default: 0.0
argument path: init_model_start_pref_v

The start virial prefactor in the loss for init-model training

init_model_with_finetune:
type: bool, optional, default: False
argument path: init_model_with_finetune

Use finetuning for init-model training

finetune_args:
type: str, optional, default: (empty string)
argument path: finetune_args

Extra arguments for finetuning

multitask:
type: bool, optional, default: False
argument path: multitask

Do multitask training

multi_init_data_idx:
type: dict | NoneType, optional, default: None
argument path: multi_init_data_idx

A dict mapping from task name to list of indices in the init data
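As an illustration of how the RunDPTrain arguments fit together, the sketch below collects some of them in a Python dict and prints it as JSON. The keys are the argument names listed above; the values are examples only (several are simply the documented defaults), and where this block sits inside a full DPGEN2 input file is not covered here.

```python
import json

# Illustrative values only; keys are RunDPTrain argument names from this section.
run_dp_train_config = {
    "impl": "pytorch",                                  # or "tensorflow" (default)
    "init_model_policy": "old_data_larger_than:1000",   # example threshold
    "init_model_old_ratio": 0.9,                        # documented default
    "init_model_numb_steps": 400000,                    # documented default
    "init_model_start_lr": 1e-4,                        # documented default
    "init_model_start_pref_e": 0.1,                     # documented default
    "init_model_start_pref_f": 100,                     # documented default
    "init_model_start_pref_v": 0.0,                     # documented default
    "init_model_with_finetune": False,
    "multitask": False,
}

print(json.dumps(run_dp_train_config, indent=2))
```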

RunLmp

command:
type: str, optional, default: lmp
argument path: command

The command of LAMMPS

teacher_model_path:
type: str | BinaryFileInput | NoneType, optional, default: None
argument path: teacher_model_path

The teacher model in Knowledge Distillation

shuffle_models:
type: bool, optional, default: False
argument path: shuffle_models

Randomly pick a model from the group of models to drive the exploration MD simulation
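The effect of shuffle_models can be pictured with a short sketch. The function `pick_driving_model` is hypothetical, and the behaviour when shuffle_models is False (use the first model in the list) is an assumption made for the example, not something stated in this reference.

```python
import random


def pick_driving_model(models, shuffle_models, rng=None):
    """Hypothetical helper: choose the model that drives the exploration MD.

    models         -- list of model file names
    shuffle_models -- value of the shuffle_models option
    rng            -- optional random.Random instance, for reproducibility
    """
    if not models:
        raise ValueError("at least one model is required")
    if shuffle_models:
        rng = rng or random.Random()
        return rng.choice(models)  # random pick from the group of models
    return models[0]               # assumption: the first model is used otherwise


models = ["model.000.pb", "model.001.pb", "model.002.pb"]
print(pick_driving_model(models, shuffle_models=True, rng=random.Random(0)))
```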

head:
type: str | NoneType, optional, default: None
argument path: head

Select a head from multitask

RunVasp

Alloy configs

Task group configs

Step configs

template_config:
type: dict, optional, default: {'image': 'dptechnology/dpgen2:latest'}
argument path: template_config

The configs passed to the PythonOPTemplate.

image:
type: str, optional, default: dptechnology/dpgen2:latest
argument path: template_config/image

The image to run the step.

timeout:
type: NoneType | int, optional, default: None
argument path: template_config/timeout

The time limit of the OP, in seconds.

retry_on_transient_error:
type: NoneType | int, optional, default: None
argument path: template_config/retry_on_transient_error

The number of retries if a TransientError is raised.

timeout_as_transient_error:
type: bool, optional, default: False
argument path: template_config/timeout_as_transient_error

Treat the timeout as TransientError.

envs:
type: dict | NoneType, optional, default: None
argument path: template_config/envs

The environment variables.
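A template_config block using the sub-arguments above might look like the following sketch. The image is the documented default; the timeout, retry count, and environment variable are example values chosen for illustration.

```python
import json

# Illustrative values only; keys are the template_config sub-arguments listed above.
template_config = {
    "image": "dptechnology/dpgen2:latest",   # documented default
    "timeout": 3600,                          # seconds
    "retry_on_transient_error": 2,
    "timeout_as_transient_error": True,
    "envs": {"OMP_NUM_THREADS": "4"},         # example environment variable
}

print(json.dumps({"template_config": template_config}, indent=2))
```

Read together, these settings would presumably cause an OP that exceeds one hour to be treated as a TransientError and retried up to two times.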

template_slice_config:
type: dict, optional
argument path: template_slice_config

The configs passed to the Slices.

group_size:
type: NoneType | int, optional, default: None
argument path: template_slice_config/group_size

The number of tasks running on a single node. It is efficient for a large number of short tasks.

pool_size:
type: NoneType | int, optional, default: None
argument path: template_slice_config/pool_size

The number of tasks running at the same time on one node.
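One way to picture group_size is as a chunking of the task list, with each chunk placed on one node and pool_size limiting how many tasks of that chunk run concurrently. The sketch below shows only the chunking; the function `plan_groups` is hypothetical, and treating an unset group_size as one task per group is an assumption made for the example.

```python
def plan_groups(n_tasks, group_size=None):
    """Hypothetical helper: split task indices into per-node groups.

    n_tasks    -- total number of tasks
    group_size -- value of template_slice_config/group_size, or None
    """
    size = group_size if group_size is not None else 1  # assumption for None
    if size < 1:
        raise ValueError("group_size must be a positive integer")
    return [
        list(range(start, min(start + size, n_tasks)))
        for start in range(0, n_tasks, size)
    ]


print(plan_groups(5, group_size=2))
```

Five tasks with a group_size of 2 give three groups, the last holding the single remaining task. For many short tasks this reduces the number of separately scheduled jobs, which is the efficiency the description refers to.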

continue_on_failed:
type: bool, optional, default: False
argument path: continue_on_failed

Whether to continue the workflow if the step fails (FatalError, TransientError, a certain number of retries reached, …).

continue_on_num_success:
type: NoneType | int, optional, default: None
argument path: continue_on_num_success

Only in the sliced OP case. Continue the workflow if a certain number of the sliced jobs are successful.

continue_on_success_ratio:
type: NoneType | float, optional, default: None
argument path: continue_on_success_ratio

Only in the sliced OP case. Continue the workflow if a certain ratio of the sliced jobs are successful.
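The two sliced-OP thresholds can be sketched as a check on the number of successful slices. The function `may_continue` is hypothetical; using "greater than or equal" for both comparisons, and requiring all slices to succeed when neither option is set, are assumptions made for the example.

```python
def may_continue(n_success, n_total,
                 continue_on_num_success=None,
                 continue_on_success_ratio=None):
    """Hypothetical helper: decide whether the workflow may proceed
    after a sliced OP has finished.

    n_success -- number of sliced jobs that succeeded
    n_total   -- total number of sliced jobs
    """
    if continue_on_num_success is not None:
        if n_success >= continue_on_num_success:   # assumption: >=
            return True
    if continue_on_success_ratio is not None and n_total > 0:
        if n_success / n_total >= continue_on_success_ratio:  # assumption: >=
            return True
    # Assumption: with no threshold met, every slice must have succeeded.
    return n_success == n_total


print(may_continue(8, 10, continue_on_success_ratio=0.75))
```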

parallelism:
type: NoneType | int, optional, default: None
argument path: parallelism

The parallelism for the step

executor:
type: dict | NoneType, optional, default: None
argument path: executor

The executor of the step.

Depending on the value of type, different sub args are accepted.

type:
type: str (flag key)
argument path: executor/type
possible choices: dispatcher

The type of the executor.

When type is set to dispatcher: