deepmd.train package

Submodules

deepmd.train.run_options module

Module taking care of important package constants.

class deepmd.train.run_options.RunOptions(init_model: Optional[str] = None, init_frz_model: Optional[str] = None, finetune: Optional[str] = None, restart: Optional[str] = None, log_path: Optional[str] = None, log_level: int = 0, mpi_log: str = 'master')[source]

Bases: object

Class with info on how to run training (cluster, MPI and GPU config).

Attributes
gpus: Optional[List[int]]

list of GPUs if any are present else None

is_chief: bool

in distributed training it is true for the main MPI process; in serial it is always true

world_size: int

total worker count

my_rank: int

index of the MPI task

nodename: str

name of the node

nodelist: List[str]

the list of nodes of the current mpirun

my_device: str

device type - gpu or cpu

Methods

print_resource_summary()

Print build and current running cluster configuration summary.

gpus: Optional[List[int]]
property is_chief

Whether my rank is 0.

my_device: str
my_rank: int
nodelist: List[int]
nodename: str
print_resource_summary()[source]

Print build and current running cluster configuration summary.

world_size: int
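
The role of is_chief can be illustrated with a small stand-alone sketch. RunInfoSketch and write_log below are hypothetical names, not part of deepmd; the sketch only mirrors the documented rule that is_chief is true when my_rank is 0, and assumes a typical use in which only the chief worker writes output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunInfoSketch:
    """Hypothetical stand-in mirroring some documented RunOptions attributes."""

    my_rank: int = 0      # index of the MPI task
    world_size: int = 1   # total worker count
    nodename: str = "localhost"
    nodelist: List[str] = field(default_factory=lambda: ["localhost"])
    gpus: Optional[List[int]] = None  # None when no GPU is present

    @property
    def is_chief(self) -> bool:
        # Documented rule: true only for rank 0, so always true in serial runs.
        return self.my_rank == 0


def write_log(info: RunInfoSketch, message: str, sink: List[str]) -> None:
    """Append the message only on the chief worker (assumed usage pattern)."""
    if info.is_chief:
        sink.append(message)


serial = RunInfoSketch()                          # serial run: rank 0 of 1
worker = RunInfoSketch(my_rank=2, world_size=4)   # non-chief MPI task

log: List[str] = []
write_log(worker, "ignored on rank 2", log)
write_log(serial, "written on rank 0", log)
```

Gating output on a single rank in this way keeps multiple MPI tasks from writing the same log lines or checkpoint files.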

deepmd.train.trainer module

class deepmd.train.trainer.DPTrainer(jdata, run_opt, is_compress=False)[source]

Bases: object

Methods

save_compressed()

Save the compressed graph.

build

eval_single_list

get_evaluation_results

get_feed_dict

get_global_step

print_header

print_on_training

save_checkpoint

train

valid_on_the_fly

build(data=None, stop_batch=0, origin_type_map=None, suffix='')[source]
static eval_single_list(single_batch_list, loss, sess, get_feed_dict_func, prefix='')[source]
get_evaluation_results(batch_list)[source]
get_feed_dict(batch, is_training)[source]
get_global_step()[source]
static print_header(fp, train_results, valid_results, multi_task_mode=False)[source]
static print_on_training(fp, train_results, valid_results, cur_batch, cur_lr, multi_task_mode=False, cur_lr_dict=None)[source]
save_checkpoint(cur_batch: int)[source]
save_compressed()[source]

Save the compressed graph.

train(train_data=None, valid_data=None)[source]
valid_on_the_fly(fp, train_batches, valid_batches, print_header=False, fitting_key=None)[source]
class deepmd.train.trainer.DatasetLoader(train_data: DeepmdDataSystem)[source]

Bases: object

Generate an OP that loads the training data from the given DeepmdDataSystem.

It can be used to load the training data in the training process, so there is no waiting time between training steps.

Parameters
train_data: DeepmdDataSystem

The training data.

Examples

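>>> from deepmd.env import tf  # TensorFlow as re-exported by DeePMD-kit
>>> from deepmd.train.trainer import DatasetLoader
>>> # train_data is an existing DeepmdDataSystem
>>> loader = DatasetLoader(train_data)
>>> data_op = loader.build()
>>> with tf.Session() as sess:
...     data_list = sess.run(data_op)
>>> data_dict = loader.get_data_dict(data_list)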
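
The benefit described above, loading data while training proceeds, can be sketched without TensorFlow. The following is not how DatasetLoader is implemented; it is a minimal illustration of the same idea using a background thread and a bounded queue. prefetch_batches and fake_load are hypothetical names introduced for this example.

```python
import queue
import threading
from typing import Callable, Iterator, List


def prefetch_batches(
    load_batch: Callable[[int], List[float]],
    num_batches: int,
    buffer_size: int = 2,
) -> Iterator[List[float]]:
    """Yield batches that a background thread loads ahead of the consumer."""
    buffer: queue.Queue = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def producer() -> None:
        for step in range(num_batches):
            buffer.put(load_batch(step))  # blocks while the buffer is full
        buffer.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()  # returns at once if a batch is already loaded
        if item is sentinel:
            return
        yield item


def fake_load(step: int) -> List[float]:
    # Hypothetical loader standing in for reading one batch from disk.
    return [float(step)] * 3


batches = list(prefetch_batches(fake_load, num_batches=4))
```

While the consumer processes one batch, the producer is already loading the next, so the consumer waits only when loading is slower than processing.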

Methods

build()

Build the OP that loads the training data.

get_data_dict(batch_list)

Generate a dict of the loaded data.

build() → List[Tensor][source]

Build the OP that loads the training data.

Returns
List[tf.Tensor]

Tensors of the loaded data.

get_data_dict(batch_list: List[ndarray]) → Dict[str, ndarray][source]

Generate a dict of the loaded data.

Parameters
batch_list: List[np.ndarray]

The loaded data.

Returns
Dict[str, np.ndarray]

The dict of the loaded data.
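
Converting a flat list of loaded arrays into a dict amounts to pairing each array with a name. The sketch below shows that pairing in isolation. to_data_dict and the key order are assumptions made for illustration, plain nested lists stand in for np.ndarray, and the actual keys and ordering used by get_data_dict are not stated in this reference.

```python
from typing import Dict, List, Sequence


def to_data_dict(
    batch_list: Sequence[List[float]],
    keys: Sequence[str],
) -> Dict[str, List[float]]:
    """Pair each loaded array with its name, preserving order."""
    if len(batch_list) != len(keys):
        raise ValueError(
            f"expected {len(keys)} arrays, got {len(batch_list)}"
        )
    return dict(zip(keys, batch_list))


# Hypothetical key order; the real order depends on the data system.
example = to_data_dict(
    [[0.0, 0.1, 0.2], [-1.5]],
    keys=["coord", "energy"],
)
```

Checking the lengths up front turns a silent misalignment between arrays and names into an immediate error.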