dpgen2.op package

Submodules

dpgen2.op.collect_data module

class dpgen2.op.collect_data.CollectData(*args, **kwargs)[source]

Bases: OP

Collect labeled data and add to the iteration dataset.

After running FP tasks, the labeled data are scattered across the task directories. This OP collects the labeled data into one data directory and adds it to the iteration data. The data generated by this iteration will be placed in the ip["name"] subdirectory of the iteration data directory.
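
The directory bookkeeping described above can be sketched with the standard library alone. This is a minimal illustration, not the dpgen2 implementation: the helper `collect_labeled_data`, the index prefix on copied directories, and the iteration names are all hypothetical, and plain copying stands in for the merge into a dpdata multi-system object that the OP performs.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def collect_labeled_data(
    name: str,
    labeled_data: List[Path],
    iter_data: List[Path],
    root: Path,
) -> List[Path]:
    """Hypothetical helper: gather scattered task outputs under root/name."""
    out_dir = root / name
    out_dir.mkdir(parents=True, exist_ok=True)
    for idx, src in enumerate(labeled_data):
        # Prefix with an index so equally named task outputs do not collide.
        shutil.copytree(src, out_dir / f"{idx:06d}.{src.name}")
    # Previous iterations are kept; the new directory is appended at the end.
    return list(iter_data) + [out_dir]


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    tasks = []
    for i in range(2):
        # Fake FP task outputs, one "data" directory per task.
        d = tmp_path / f"task.{i:06d}" / "data"
        d.mkdir(parents=True)
        (d / "energy.raw").write_text("0.0\n")
        tasks.append(d)
    previous = [tmp_path / "iter-000000"]
    result = collect_labeled_data("iter-000001", tasks, previous, tmp_path)
    collected = sorted(p.name for p in result[-1].iterdir())
    print([p.name for p in result], collected)
```

The returned list plays the role of the updated iter_data: earlier entries are untouched and the directory for the current iteration comes last.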

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

default_optional_parameter = {'mixed_type': False}

execute(ip: OPIO) → OPIO[source]

Execute the OP. This OP collects the data scattered in the directories given by ip['labeled_data'] into one dpdata.MultiSystems and stores it in a directory named name. This directory is appended to the list iter_data.

Parameters:

ipdict

Input dict with components:

name: (str) The name of this iteration. The data generated by this iteration will be placed in a sub-directory named name.
labeled_data: (Artifact(List[Path])) The paths of labeled data generated by FP tasks of the current iteration.
iter_data: (Artifact(List[Path])) The data paths of previous iterations.

Returns:

Any

Output dict with components:

iter_data: (Artifact(List[Path])) The data paths of the previous and the current iterations.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.collect_run_caly module

class dpgen2.op.collect_run_caly.CollRunCaly(*args, **kwargs)[source]

Bases: OP

Execute CALYPSO to generate structures in work_dir.

Changes the working directory to task_name. All input files have been copied or symbolically linked to the directory task_name by PrepCalyInput. The CALYPSO command is executed from the directory task_name. The caly.log and the work_dir will be stored in op["log"] and op["work_dir"], respectively.

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

calypso_args
convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
normalize_config
register_output_artifact
superfunction

static calypso_args()[source]

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

config: (dict) The config of calypso task to obtain the command of calypso.
task_name: (str) The name of the task (calypso_task.{idx}).
input_file: (Path) The input file of the task (input.dat).
step: (Path) The step file from last calypso run
results: (Path) The results dir from last calypso run
opt_results_dir: (Path) The results dir that contains POSCAR*, CONTCAR* and OUTCAR* from the last calypso run
qhull_input: (Path) qhull input file test_qconvex.in

Returns:

Any

Output dict with components:

poscar_dir: (Path) The dir that contains POSCAR*.
task_name: (str) The name of the task (calypso_task.{idx}).
input_file: (Path) The input file of the task (input.dat).
step: (Path) The step file.
results: (Path) The results dir.
qhull_input: (Path) qhull input file.

Raises:

TransientError: On the failure of CALYPSO execution. Resubmit rule should be clear.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

static normalize_config(data={})[source]

dpgen2.op.collect_run_caly.config_args()

dpgen2.op.collect_run_caly.get_value_from_inputdat(filename)[source]

dpgen2.op.collect_run_caly.prep_last_calypso_file(step, results, opt_results_dir, qhull_input, vsc)[source]

dpgen2.op.md_settings module

class dpgen2.op.md_settings.MDSettings(ens: str, dt: float, nsteps: int, trj_freq: int, temps: List[float] | None = None, press: List[float] | None = None, tau_t: float = 0.1, tau_p: float = 0.5, pka_e: float | None = None, neidelay: int | None = None, no_pbc: bool = False, use_clusters: bool = False, relative_epsilon: float | None = None, relative_v_epsilon: float | None = None, ele_temp_f: float | None = None, ele_temp_a: float | None = None)[source]

Bases: object

Methods

to_str

to_str() → str[source]

dpgen2.op.prep_caly_dp_optim module

class dpgen2.op.prep_caly_dp_optim.PrepCalyDPOptim(*args, **kwargs)[source]

Bases: OP

Prepare the working directories and input file according to slices information for structure optimization with DP.

POSCAR_*, frozen_model.pb or model.ckpt.pt, calypso_run_opt.py and calypso_check_opt.py will be copied or symlinked to each optimization directory from ip["work_path"], according to the group size of ip["template_slice_config"]. The POSCAR_* files will be split into group_size parts, and the name of each path will be returned in a task_names list and a task_dirs list.
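
The grouping step can be sketched as follows. This is a minimal illustration under a stated assumption: group_size is taken to be the number of structures handled by one optimization directory (the description can also be read as the number of parts). The helper `group_poscars` and the `opt_task.NNN` names are hypothetical and are not the names dpgen2 generates.

```python
from typing import Dict, List


def group_poscars(poscar_names: List[str], group_size: int) -> Dict[str, List[str]]:
    """Hypothetical helper: map a task name to the POSCAR files it handles."""
    if group_size < 1:
        raise ValueError("group_size must be a positive integer")
    ordered = sorted(poscar_names)
    groups = {}
    for start in range(0, len(ordered), group_size):
        # Task names are illustrative only.
        task_name = f"opt_task.{start // group_size:03d}"
        groups[task_name] = ordered[start:start + group_size]
    return groups


groups = group_poscars([f"POSCAR_{i}" for i in range(1, 6)], group_size=2)
for task_name, files in groups.items():
    print(task_name, files)
```

With five structures and a group size of two, the last group holds the single leftover structure.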

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

task_name: (str)
finished: (str)
template_slice_config: (dict)
poscar_dir: (Path)
models_dir: (Path)
caly_run_opt_file: (Path)
caly_check_opt_file: (Path)

Returns:

opdict

Output dict with components:

task_names: (List[str])
task_dirs: (Artifact(List[Path]))
caly_run_opt_file : (Path)
caly_check_opt_file : (Path)

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.prep_caly_input module

class dpgen2.op.prep_caly_input.PrepCalyInput(*args, **kwargs)[source]

Bases: OP

Prepare the working directories and input file for generating structures.

A calypso input file will be generated according to the given parameters (defined by ip["caly_inputs"]). The artifact will be returned (ip[input_files]). The name of the directory is ip["task_names"].

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

caly_task_grp: (BigParameter()) Definitions for CALYPSO input file.

Returns:

opdict

Output dict with components:

task_names: (List[str]) The name of CALYPSO tasks. Will be used as the identities of the tasks. The names of different tasks are different.
input_dat_files: (Artifact(List[Path])) The prepared working paths of the task containing input files (input.dat and calypso_run_opt.py) needed to generate structures with CALYPSO and to run structure optimization with the DP model.
caly_run_opt_files: (Artifact(List[Path]))
caly_check_opt_files: (Artifact(List[Path]))

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.prep_dp_train module

class dpgen2.op.prep_dp_train.PrepDPTrain(*args, **kwargs)[source]

Bases: OP

Prepares the working directories for DP training tasks.

A list of (numb_models) working directories containing all files needed to start training tasks will be created. The paths of the directories will be returned as op[“task_paths”]. The identities of the tasks are returned as op[“task_names”].

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

template_script: (str or List[str]) A template of the training script. Can be a str or List[str]. In the case of str, all training tasks share the same training input template; the only difference is the random number used to initialize the network parameters. In the case of List[str], each training task uses one template from the list, and the random numbers used to initialize the network parameters are different. The length of the list should be the same as numb_models.
numb_models: (int) Number of DP models to train.

Returns:

opdict

Output dict with components:

task_names: (List[str]) The name of tasks. Will be used as the identities of the tasks. The names of different tasks are different.
task_paths: (Artifact(List[Path])) The prepared working paths of the tasks. The order of the paths should be consistent with op["task_names"].

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs
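
The two forms of template_script accepted by PrepDPTrain.execute can be illustrated with a short sketch. It is not the dpgen2 implementation: the helper `make_training_scripts`, the single `seed` key (a stand-in for whichever fields of a real training script hold random seeds), the JSON encoding of the template, and the `task.NNNN` names are all assumptions made for the example.

```python
import json
import random
from typing import List, Union


def make_training_scripts(
    template_script: Union[str, List[str]],
    numb_models: int,
    rng: random.Random,
) -> List[str]:
    """Hypothetical helper: expand the template(s) into one script per model."""
    if isinstance(template_script, str):
        # One shared template: every model gets a copy of it.
        templates = [template_script] * numb_models
    else:
        if len(template_script) != numb_models:
            raise ValueError("need exactly one template per model")
        templates = list(template_script)
    # Sampling without replacement guarantees pairwise different seeds.
    seeds = rng.sample(range(2**32), len(templates))
    scripts = []
    for template, seed in zip(templates, seeds):
        config = json.loads(template)
        config["seed"] = seed  # stand-in for the real seed fields
        scripts.append(json.dumps(config, sort_keys=True))
    return scripts


scripts = make_training_scripts('{"numb_steps": 1000}', 4, random.Random(0))
seeds = [json.loads(s)["seed"] for s in scripts]
task_names = [f"task.{i:04d}" for i in range(len(scripts))]
print(task_names, seeds)
```

In both cases the result is one script per model, and the scripts differ at least in the seed, which is what makes the trained models an ensemble rather than copies of each other.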

dpgen2.op.prep_lmp module

dpgen2.op.prep_lmp.PrepExplorationTaskGroup: alias of PrepLmp

class dpgen2.op.prep_lmp.PrepLmp(*args, **kwargs)[source]

Bases: OP

Prepare the working directories for LAMMPS tasks.

A list of working directories (defined by ip[“task”]) containing all files needed to start LAMMPS tasks will be created. The paths of the directories will be returned as op[“task_paths”]. The identities of the tasks are returned as op[“task_names”].
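
A minimal sketch of this preparation step, using only the standard library, is given here. It is an illustration rather than the dpgen2 code: a task is modelled as a plain dict from file name to file content instead of an ExplorationTaskGroup, and the helper `prepare_task_dirs`, the file names, and the `task.NNNNNN` naming scheme are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


def prepare_task_dirs(
    tasks: List[Dict[str, str]], root: Path
) -> Tuple[List[str], List[Path]]:
    """Hypothetical helper: write each task's files into its own directory."""
    task_names, task_paths = [], []
    for idx, files in enumerate(tasks):
        name = f"task.{idx:06d}"  # illustrative naming scheme
        path = root / name
        path.mkdir(parents=True)
        for fname, content in files.items():
            (path / fname).write_text(content)
        # Appending in lockstep keeps names and paths in the same order.
        task_names.append(name)
        task_paths.append(path)
    return task_names, task_paths


with tempfile.TemporaryDirectory() as tmp:
    tasks = [
        {"in.lammps": "# input for task 0\n", "conf.lmp": "# structure 0\n"},
        {"in.lammps": "# input for task 1\n", "conf.lmp": "# structure 1\n"},
    ]
    task_names, task_paths = prepare_task_dirs(tasks, Path(tmp))
    written = [sorted(p.name for p in path.iterdir()) for path in task_paths]
    print(task_names, written)
```

Building both lists in the same loop is the simplest way to guarantee that the order of the paths matches the order of the names.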

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

lmp_task_grp: (BigParameter(Path)) Definitions for LAMMPS tasks. Can be pickle loaded as an ExplorationTaskGroup.

Returns:

opdict

Output dict with components:

task_names: (List[str]) The name of tasks. Will be used as the identities of the tasks. The names of different tasks are different.
task_paths: (Artifact(List[Path])) The prepared working paths of the tasks. Contains all input files needed to start the LAMMPS simulation. The order of the paths should be consistent with op["task_names"].

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.run_caly_dp_optim module

class dpgen2.op.run_caly_dp_optim.RunCalyDPOptim(*args, **kwargs)[source]

Bases: OP

Perform structure optimization with DP in ip[“work_path”].

The optim_results_dir and traj_results will be returned as op[“optim_results_dir”] and op[“traj_results”].

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

config: (dict) The config of calypso task to obtain the command of calypso.
task_name: (str)
finished: (str)
cnt_num: (int)
task_dir: (Path)

Returns:

opdict

Output dict with components:

task_name: (str)
optim_results_dir: (List[str])
traj_results: (Artifact(List[Path]))

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.run_caly_model_devi module

class dpgen2.op.run_caly_model_devi.RunCalyModelDevi(*args, **kwargs)[source]

Bases: OP

Calculate the model deviation of trajectory structures.

Structure optimization will be executed in optim_path. The trajectory and the model deviation will be stored in files op["traj"] and op["model_devi"], respectively.

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

type_map: (List[str]) The type map of elements.
task_name: (str) The name of the task.
traj_dirs: (Artifact(List[Path])) The list of paths that contain trajectory files.
models: (Artifact(List[Path])) The frozen models used to estimate the model deviation.

Returns:

Any

Output dict with components:

task_name: (str) The name of the task.
traj: (Artifact(List[Path])) The output trajectory.
model_devi: (Artifact(List[Path])) The model deviation. The order of recorded model deviations should be consistent with the order of frames in traj.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

dpgen2.op.run_caly_model_devi.atoms2lmpdump(atoms, struc_idx, type_map, ignore=False)[source]

The lower-triangular cell can be obtained from the cell parameters a, b, c, alpha, beta, gamma:

cell = cellpar_to_cell([a, b, c, alpha, beta, gamma])
lx, ly, lz = cell[0][0], cell[1][1], cell[2][2]
xy, xz, yz = cell[1][0], cell[2][0], cell[2][1]
(lx, ly, lz) = (xhi - xlo, yhi - ylo, zhi - zlo)
xlo_bound = xlo + MIN(0.0, xy, xz, xy + xz)
xhi_bound = xhi + MAX(0.0, xy, xz, xy + xz)
ylo_bound = ylo + MIN(0.0, yz)
yhi_bound = yhi + MAX(0.0, yz)
zlo_bound = zlo
zhi_bound = zhi

ref: https://docs.lammps.org/Howto_triclinic.html
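
The bounding-box relations above translate directly into a few lines of Python. The sketch below is not the body of atoms2lmpdump; the function name `triclinic_bounds` and the choice of placing the box origin at (0, 0, 0) by default are assumptions made for the example.

```python
from typing import Tuple

Bounds = Tuple[float, float]


def triclinic_bounds(
    lx: float, ly: float, lz: float,
    xy: float, xz: float, yz: float,
    origin: Tuple[float, float, float] = (0.0, 0.0, 0.0),
) -> Tuple[Bounds, Bounds, Bounds]:
    """Return ((xlo_bound, xhi_bound), (ylo_bound, yhi_bound), (zlo_bound, zhi_bound))."""
    xlo, ylo, zlo = origin
    xhi, yhi, zhi = xlo + lx, ylo + ly, zlo + lz
    # The tilt factors enlarge the orthogonal bounding box along x and y only.
    x_bounds = (xlo + min(0.0, xy, xz, xy + xz), xhi + max(0.0, xy, xz, xy + xz))
    y_bounds = (ylo + min(0.0, yz), yhi + max(0.0, yz))
    z_bounds = (zlo, zhi)
    return x_bounds, y_bounds, z_bounds


bounds = triclinic_bounds(10.0, 8.0, 6.0, xy=2.0, xz=-1.0, yz=0.5)
print(bounds)  # ((-1.0, 12.0), (0.0, 8.5), (0.0, 6.0))
```

For an orthogonal cell all three tilt factors are zero and the bounds reduce to (xlo, xhi), (ylo, yhi), (zlo, zhi).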

dpgen2.op.run_caly_model_devi.parse_traj(traj_file)[source]

dpgen2.op.run_caly_model_devi.write_model_devi_out(devi: ndarray, fname: str | Path, header: str = '')[source]

dpgen2.op.run_dp_train module

class dpgen2.op.run_dp_train.RunDPTrain(*args, **kwargs)[source]

Bases: OP

Execute a DP training task. Train and freeze a DP model.

A working directory named task_name is created. All input files are copied or symbolically linked to the directory task_name. The DeePMD-kit training and freezing commands are executed from the directory task_name.

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
decide_init_model
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
normalize_config
register_output_artifact
skip_training
superfunction
training_args
write_data_to_input_script
write_other_to_input_script

static decide_init_model(config, init_model, init_data, iter_data, mixed_type=False)[source]

default_optional_parameter = {'finetune_mode': 'no', 'mixed_type': False}

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

config: (dict) The config of training task. Check RunDPTrain.training_args for definitions.
task_name: (str) The name of training task.
task_path: (Artifact(Path)) The path that contains all input files prepared by PrepDPTrain.
init_model: (Artifact(Path)) A frozen model to initialize the training.
init_data: (Artifact(List[Path])) Initial training data.
iter_data: (Artifact(List[Path])) Training data generated in the DPGEN iterations.

Returns:

Any

Output dict with components:

script: (Artifact(Path)) The training script.
model: (Artifact(Path)) The trained frozen model.
lcurve: (Artifact(Path)) The learning curve file.
log: (Artifact(Path)) The log file of training.

Raises:

FatalError: On the failure of training or freezing. Human intervention needed.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

static normalize_config(data={})[source]

static skip_training(work_dir, train_dict, init_model, iter_data, finetune_mode)[source]

static training_args()[source]

static write_data_to_input_script(idict: dict, config, init_data: List[Path], iter_data: List[Path], auto_prob_str: str = 'prob_sys_size', major_version: str = '1', valid_data: List[Path] | None = None)[source]

static write_other_to_input_script(idict, config, do_init_model, major_version: str = '1')[source]

dpgen2.op.run_dp_train.config_args()

dpgen2.op.run_lmp module

class dpgen2.op.run_lmp.RunLmp(*args, **kwargs)[source]

Bases: OP

Execute a LAMMPS task.

A working directory named task_name is created. All input files are copied or symbolically linked to the directory task_name. The LAMMPS command is executed from the directory task_name. The trajectory and the model deviation will be stored in files op["traj"] and op["model_devi"], respectively.
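
The run pattern described above (create the task directory, link the inputs, run the command from inside it, treat a non-zero exit as a transient failure) can be sketched as follows. This is a hedged illustration, not the RunLmp code: `run_in_task_dir` is a hypothetical helper, the `TransientError` class is a local stand-in for the error type named in this reference, the log file name is invented, and a short Python command stands in for the LAMMPS executable.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


class TransientError(Exception):
    """Local stand-in for the error raised when a run may be resubmitted."""


def run_in_task_dir(
    task_name: str, inputs: List[Path], command: List[str], root: Path
) -> Path:
    """Hypothetical helper: link inputs into root/task_name and run command there."""
    work_dir = root / task_name
    work_dir.mkdir(parents=True, exist_ok=True)
    for src in inputs:
        link = work_dir / src.name
        if not link.exists():
            os.symlink(src.resolve(), link)
    # Both output streams of the command go to a log file in the task directory.
    with open(work_dir / "log", "w") as log:
        ret = subprocess.run(
            command, cwd=work_dir, stdout=log, stderr=subprocess.STDOUT
        )
    if ret.returncode != 0:
        raise TransientError(f"command failed with exit code {ret.returncode}")
    return work_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    input_file = root / "in.lammps"
    input_file.write_text("# placeholder input\n")
    # The stand-in command lists the files it can see in its working directory.
    command = [sys.executable, "-c", "import os; print(sorted(os.listdir('.')))"]
    work_dir = run_in_task_dir("task.000000", [input_file], command, root)
    log_text = (work_dir / "log").read_text()
    print(log_text)
```

Running the command with `cwd` set to the task directory, rather than changing the directory of the calling process, keeps the caller's working directory unchanged if the command fails.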

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
lmp_args
normalize_config
register_output_artifact
superfunction

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

config: (dict) The config of lmp task. Check RunLmp.lmp_args for definitions.
task_name: (str) The name of the task.
task_path: (Artifact(Path)) The path that contains all input files prepared by PrepLmp.
models: (Artifact(List[Path])) The frozen models used to estimate the model deviation. The first model will be used to drive the molecular dynamics simulation.

Returns:

Any

Output dict with components:

log: (Artifact(Path)) The log file of LAMMPS.
traj: (Artifact(Path)) The output trajectory.
model_devi: (Artifact(Path)) The model deviation. The order of recorded model deviations should be consistent with the order of frames in traj.

Raises:

TransientError: On the failure of LAMMPS execution. Handle different failure cases? e.g. lost atoms.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

static lmp_args()[source]

static normalize_config(data={})[source]

dpgen2.op.run_lmp.config_args()

dpgen2.op.run_lmp.find_only_one_key(lmp_lines, key, raise_not_found=True)[source]

dpgen2.op.run_lmp.set_models(lmp_input_name: str, model_names: List[str])[source]

dpgen2.op.select_confs module

class dpgen2.op.select_confs.SelectConfs(*args, **kwargs)[source]

Bases: OP

Select configurations from exploration trajectories for labeling.

Attributes:

key
workflow_name

Methods

`execute`(ip)	Execute the OP.
`get_input_sign`()	Get the signature of the inputs
`get_output_sign`()	Get the signature of the outputs

convert_to_graph
exec_sign_check
from_graph
function
get_info
get_input_artifact_link
get_input_artifact_storage_key
get_opio_info
get_output_artifact_link
get_output_artifact_storage_key
register_output_artifact
superfunction
validate_trajs

execute(ip: OPIO) → OPIO[source]

Execute the OP.

Parameters:

ipdict

Input dict with components:

conf_selector: (ConfSelector) Configuration selector.
type_map: (List[str]) The type map.
trajs: (Artifact(List[Path])) The trajectories generated in the exploration.
model_devis: (Artifact(List[Path])) The files storing the model deviations of the trajectories. The order of the model deviation files is consistent with that of the trajectories. The order of frames in each model deviation file is also consistent with that of the corresponding trajectory.

Returns:

Any

Output dict with components:

report: (ExplorationReport) The report on the exploration.
conf: (Artifact(List[Path])) The selected configurations.

classmethod get_input_sign()[source]: Get the signature of the inputs

classmethod get_output_sign()[source]: Get the signature of the outputs

static validate_trajs(trajs, model_devis)[source]
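
The ordering requirement on trajs and model_devis stated for SelectConfs.execute matters because frames and deviations are matched by position. The sketch below illustrates that pairing on in-memory lists. It is not the behaviour of validate_trajs, whose body is not documented here; the helper `pair_frames_with_devi` and the sample frame labels and deviation values are hypothetical.

```python
from typing import List, Sequence, Tuple


def pair_frames_with_devi(
    trajs: Sequence[Sequence[str]],
    model_devis: Sequence[Sequence[float]],
) -> List[Tuple[str, float]]:
    """Hypothetical helper: zip every frame with its model deviation."""
    if len(trajs) != len(model_devis):
        raise ValueError("need one model deviation record per trajectory")
    pairs: List[Tuple[str, float]] = []
    for idx, (frames, devis) in enumerate(zip(trajs, model_devis)):
        if len(frames) != len(devis):
            raise ValueError(f"trajectory {idx}: frame and deviation counts differ")
        # Positional pairing is only valid because both orders agree.
        pairs.extend(zip(frames, devis))
    return pairs


trajs = [["t0-f0", "t0-f1"], ["t1-f0"]]
model_devis = [[0.02, 0.31], [0.12]]
pairs = pair_frames_with_devi(trajs, model_devis)
print(pairs)  # [('t0-f0', 0.02), ('t0-f1', 0.31), ('t1-f0', 0.12)]
```

A count mismatch is the only inconsistency such a check can detect; a reordering with equal counts would pass silently, which is why the producing OPs are required to keep the orders consistent.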