deepmd.tf.utils

Submodules

Package Contents

Classes

DeepmdData

Class for a data system.

DeepmdDataSystem

Class for manipulating many data systems.

LearningRateExp

The exponentially decaying learning rate.

PairTab

Pairwise tabulated potential.

Plugin

A class to register and restore plugins.

PluginVariant

A class to remove type from input arguments.

class deepmd.tf.utils.DeepmdData(sys_path: str, set_prefix: str = 'set', shuffle_test: bool = True, type_map: List[str] | None = None, optional_type_map: bool = True, modifier=None, trn_all_set: bool = False, sort_atoms: bool = True)[source]

Class for a data system.

It loads data from the hard disk and maintains the data as a data_dict

Parameters:
sys_path

Path to the data system

set_prefix

Prefix for the directories of different sets

shuffle_test

If the test data are shuffled

type_map

Gives the name of different atom types

optional_type_map

If the type_map.raw in each system is optional

modifier

Data modifier that has the method modify_data

trn_all_set

Use all sets as the training dataset. Otherwise, if the number of sets is more than 1, the last set is left for testing.

sort_atoms: bool

Sort atoms by atom type. Required to be enabled when the data is fed directly to descriptors, except for mixed types.

add(key: str, ndof: int, atomic: bool = False, must: bool = False, high_prec: bool = False, type_sel: List[int] | None = None, repeat: int = 1, default: float = 0.0, dtype: numpy.dtype | None = None, output_natoms_for_type_sel: bool = False)[source]

Add a data item to be loaded.

Parameters:
key

The key of the item. The corresponding data is stored in sys_path/set.*/key.npy

ndof

The number of dof

atomic

The item is an atomic property. If False, the size of the data should be nframes x ndof; if True, the size should be nframes x natoms x ndof.

must

The data file sys_path/set.*/key.npy must exist. If must is False and the data file does not exist, data_dict[find_key] is set to 0.0.

high_prec

Load the data and store in float64, otherwise in float32

type_sel

Select certain type of atoms

repeat

The data will be repeated repeat times.

default: float, default=0.

Default value of the data.

dtype: np.dtype, optional

The dtype of the data; overwrites high_prec if provided.

output_natoms_for_type_sel: bool, optional

If True and type_sel is given, the atomic dimension will be natoms instead of nsel.
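The atomic and high_prec conventions translate into array shapes and dtypes as follows (a plain NumPy illustration of the documented layout, not the loader itself; the sizes are made up):

```python
import numpy as np

nframes, natoms, ndof = 5, 4, 3

# atomic=False: one row of ndof values per frame -> nframes x ndof
frame_item = np.zeros((nframes, ndof), dtype=np.float32)

# atomic=True: ndof values per atom per frame -> nframes x natoms x ndof
atomic_item = np.zeros((nframes, natoms, ndof), dtype=np.float32)

# high_prec=True: same layout, stored in float64 instead of float32
high_prec_item = np.zeros((nframes, ndof), dtype=np.float64)
```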

reduce(key_out: str, key_in: str)[source]

Generate a new item from the reduction of another item.

Parameters:
key_out

The name of the reduced item

key_in

The name of the data item to be reduced

get_data_dict() dict[source]

Get the data_dict.

check_batch_size(batch_size)[source]

Check if the system can get a batch of data with batch_size frames.

check_test_size(test_size)[source]

Check if the system can get a test dataset with test_size frames.

get_item_torch(index: int) dict[source]

Get the data of a single frame. The frame is picked from the data system by index. The index is coded across all the sets.

Parameters:
index

index of the frame

get_batch(batch_size: int) dict[source]

Get a batch of data with batch_size frames. The frames are randomly picked from the data system.

Parameters:
batch_size

size of the batch

get_test(ntests: int = -1) dict[source]

Get the test data with ntests frames.

Parameters:
ntests

Size of the test data set. If ntests is -1, all test data will be returned.

get_ntypes() int[source]

Number of atom types in the system.

get_type_map() List[str][source]

Get the type map.

get_atom_type() List[int][source]

Get atom types.

get_numb_set() int[source]

Get number of training sets.

get_numb_batch(batch_size: int, set_idx: int) int[source]

Get the number of batches in a set.

get_sys_numb_batch(batch_size: int) int[source]

Get the number of batches in the data system.

get_natoms()[source]

Get number of atoms.

get_natoms_vec(ntypes: int)[source]

Get number of atoms and number of atoms in different types.

Parameters:
ntypes

Number of types (may be larger than the actual number of types in the system).

Returns:
natoms

natoms[0]: number of local atoms.
natoms[1]: total number of atoms held by this processor.
natoms[i]: number of atoms of type i - 2, for 2 <= i < Ntypes + 2.
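The layout can be illustrated with a small helper (an illustrative sketch of the documented vector layout, not the method's implementation):

```python
import numpy as np

def natoms_vec(atom_types, ntypes):
    """Build the natoms vector described above for a single-process system."""
    atom_types = np.asarray(atom_types)
    vec = np.zeros(ntypes + 2, dtype=np.int64)
    vec[0] = len(atom_types)           # number of local atoms
    vec[1] = len(atom_types)           # total atoms held by this processor
    for t in range(ntypes):            # per-type counts at indices 2..ntypes+1
        vec[2 + t] = np.sum(atom_types == t)
    return vec

print(natoms_vec([0, 0, 1, 0], ntypes=3))  # -> [4 4 3 1 0]
```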

avg(key)[source]

Return the average value of an item.

_idx_map_sel(atom_type, type_sel)[source]
_get_natoms_2(ntypes)[source]
_get_subdata(data, idx=None)[source]
_load_batch_set(set_name: deepmd.utils.path.DPPath)[source]
reset_get_batch()[source]
_load_test_set(set_name: deepmd.utils.path.DPPath, shuffle_test)[source]
_shuffle_data(data)[source]
_get_nframes(set_name: deepmd.utils.path.DPPath)[source]
reformat_data_torch(data)[source]

Modify the data format for the requirements of Torch backend.

Parameters:
data

original data

_load_set(set_name: deepmd.utils.path.DPPath)[source]
_load_data(set_name, key, nframes, ndof_, atomic=False, must=True, repeat=1, high_prec=False, type_sel=None, default: float = 0.0, dtype: numpy.dtype | None = None, output_natoms_for_type_sel: bool = False)[source]
_load_type(sys_path: deepmd.utils.path.DPPath)[source]
_load_type_mix(set_name: deepmd.utils.path.DPPath)[source]
_make_idx_map(atom_type)[source]
_load_type_map(sys_path: deepmd.utils.path.DPPath)[source]
_check_pbc(sys_path: deepmd.utils.path.DPPath)[source]
_check_mode(set_path: deepmd.utils.path.DPPath)[source]
class deepmd.tf.utils.DeepmdDataSystem(systems: List[str], batch_size: int, test_size: int, rcut: float | None = None, set_prefix: str = 'set', shuffle_test: bool = True, type_map: List[str] | None = None, optional_type_map: bool = True, modifier=None, trn_all_set=False, sys_probs=None, auto_prob_style='prob_sys_size', sort_atoms: bool = True)[source]

Class for manipulating many data systems.

It is implemented with the help of DeepmdData

property default_mesh: List[numpy.ndarray]

Mesh for each system.

_load_test(ntests=-1)[source]
compute_energy_shift(rcond=None, key='energy')[source]
add_dict(adict: dict) None[source]

Add items to the data system by a dict. adict should have items like:

adict[key] = {
    "ndof": ndof,
    "atomic": atomic,
    "must": must,
    "high_prec": high_prec,
    "type_sel": type_sel,
    "repeat": repeat,
}

For the explanation of the keys, see add.
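For example, an adict entry registering a hypothetical per-frame, 3-component item named "dipole" might look like:

```python
adict = {
    "dipole": {  # hypothetical key; any data item name works the same way
        "ndof": 3,         # three components per frame
        "atomic": False,   # a per-frame, not per-atom, property
        "must": True,      # the .npy file must exist
        "high_prec": False,
        "type_sel": None,
        "repeat": 1,
    }
}
```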

add(key: str, ndof: int, atomic: bool = False, must: bool = False, high_prec: bool = False, type_sel: List[int] | None = None, repeat: int = 1, default: float = 0.0, dtype: numpy.dtype | None = None, output_natoms_for_type_sel: bool = False)[source]

Add a data item to be loaded.

Parameters:
key

The key of the item. The corresponding data is stored in sys_path/set.*/key.npy

ndof

The number of dof

atomic

The item is an atomic property. If False, the size of the data should be nframes x ndof; if True, the size should be nframes x natoms x ndof.

must

The data file sys_path/set.*/key.npy must exist. If must is False and the data file does not exist, data_dict[find_key] is set to 0.0.

high_prec

Load the data and store in float64, otherwise in float32

type_sel

Select certain type of atoms

repeat

The data will be repeated repeat times.

default: float, default=0.

Default value of the data.

dtype: np.dtype, optional

The dtype of the data; overwrites high_prec if provided.

output_natoms_for_type_sel: bool, optional

If True and type_sel is given, the atomic dimension will be natoms instead of nsel.

reduce(key_out, key_in)[source]

Generate a new item from the reduction of another item.

Parameters:
key_out

The name of the reduced item

key_in

The name of the data item to be reduced

get_data_dict(ii: int = 0) dict[source]
set_sys_probs(sys_probs=None, auto_prob_style: str = 'prob_sys_size')[source]
get_batch(sys_idx: int | None = None) dict[source]

Get a batch of data from the data systems.

Parameters:
sys_idx: int

The index of the system from which the batch is taken. If sys_idx is not None, sys_probs and auto_prob_style are ignored. If sys_idx is None, the system is determined automatically according to sys_probs or auto_prob_style; see below. This option does not work for mixed systems.

Returns:
dict

The batch data

get_batch_standard(sys_idx: int | None = None) dict[source]

Get a batch of data from the data systems in the standard way.

Parameters:
sys_idx: int

The index of the system from which the batch is taken. If sys_idx is not None, sys_probs and auto_prob_style are ignored. If sys_idx is None, the system is determined automatically according to sys_probs or auto_prob_style; see below.

Returns:
dict

The batch data

get_batch_mixed() dict[source]

Get a batch of data from the data systems in the mixed way.

Returns:
dict

The batch data

_merge_batch_data(batch_data: List[dict]) dict[source]

Merge batch data from different systems.

Parameters:
batch_data: list of dict

A list of batch data from different systems.

Returns:
dict

The merged batch data.

get_test(sys_idx: int | None = None, n_test: int = -1)[source]

Get test data from the data systems.

Parameters:
sys_idx

The test data of the system with index sys_idx will be returned. If None, the test data of the currently selected system will be returned.

n_test

Number of test data. If set to -1, all test data will be returned.

get_sys_ntest(sys_idx=None)[source]

Get number of tests for the currently selected system, or one defined by sys_idx.

get_type_map() List[str][source]

Get the type map.

get_nbatches() int[source]

Get the total number of batches.

get_ntypes() int[source]

Get the number of types.

get_nsystems() int[source]

Get the number of data systems.

get_sys(idx: int) deepmd.utils.data.DeepmdData[source]

Get a certain data system.

get_batch_size() int[source]

Get the batch size.

print_summary(name: str)[source]
_make_auto_bs(rule)[source]
_make_auto_ts(percent)[source]
_check_type_map_consistency(type_map_list)[source]
class deepmd.tf.utils.LearningRateExp(start_lr: float, stop_lr: float = 5e-08, decay_steps: int = 5000, decay_rate: float = 0.95)[source]

The exponentially decaying learning rate.

The learning rate at step \(t\) is given by

\[\alpha(t) = \alpha_0 \lambda ^ { t / \tau }\]

where \(\alpha\) is the learning rate, \(\alpha_0\) is the starting learning rate, \(\lambda\) is the decay rate, and \(\tau\) is the decay steps.

Parameters:
start_lr

Starting learning rate \(\alpha_0\)

stop_lr

Stop learning rate \(\alpha_1\)

decay_steps

Learning rate decay every this number of steps \(\tau\)

decay_rate

The decay rate \(\lambda\). If stop_step is provided in build, then it will be determined automatically and overwritten.
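The decay schedule can be sketched in plain Python. The formula for determining λ from stop_step is an assumption, derived by solving α(stop_step) = stop_lr, and may differ in detail from the actual implementation:

```python
import math

def lr_value(step, start_lr=1e-3, decay_rate=0.95, decay_steps=5000):
    # alpha(t) = alpha_0 * lambda ** (t / tau)
    return start_lr * decay_rate ** (step / decay_steps)

def decay_rate_from_stop(start_lr, stop_lr, decay_steps, stop_step):
    # Assumed: pick lambda so that alpha(stop_step) == stop_lr
    return math.exp(math.log(stop_lr / start_lr) * decay_steps / stop_step)

# With the documented defaults, the rate needed to hit stop_lr after 1M steps:
rate = decay_rate_from_stop(1e-3, 5e-8, 5000, 1_000_000)
```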

build(global_step: deepmd.tf.env.tf.Tensor, stop_step: int | None = None) deepmd.tf.env.tf.Tensor[source]

Build the learning rate.

Parameters:
global_step

The tf Tensor providing the global training step

stop_step

The stop step. If provided, the decay_rate will be determined automatically and overwritten.

Returns:
learning_rate

The learning rate

start_lr() float[source]

Get the start lr.

value(step: int) float[source]

Get the lr at a certain step.

class deepmd.tf.utils.PairTab(filename: str, rcut: float | None = None)[source]

Pairwise tabulated potential.

Parameters:
filename

File name for the short-range tabulated potential. The table is a text data file with (N_t + 1) * N_t / 2 + 1 columns. The first column is the distance between atoms. The second to the last columns are energies for pairs of certain types. For example, if there are two atom types, 0 and 1, the 2nd to 4th columns are for the 0-0, 0-1, and 1-1 pairs, respectively.
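The column count and pair ordering described above can be checked with a quick sketch; the closed-form column index is our own derivation from the stated ordering, not an API of PairTab:

```python
ntypes = 2
ncols = (ntypes + 1) * ntypes // 2 + 1  # one distance column + one per type pair

def pair_column(i, j, ntypes):
    # Column index of the i-j pair, counting the distance column as 0,
    # with pairs ordered 0-0, 0-1, ..., 0-(n-1), 1-1, 1-2, ...
    i, j = min(i, j), max(i, j)
    return 1 + i * ntypes - i * (i - 1) // 2 + (j - i)
```

For two types this reproduces the example in the docstring: columns 1, 2, 3 (the 2nd to 4th columns) hold the 0-0, 0-1, and 1-1 energies.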

reinit(filename: str, rcut: float | None = None) None[source]

Initialize the tabulated interaction.

Parameters:
filename

File name for the short-range tabulated potential. The table is a text data file with (N_t + 1) * N_t / 2 + 1 columns. The first column is the distance between atoms. The second to the last columns are energies for pairs of certain types. For example, if there are two atom types, 0 and 1, the 2nd to 4th columns are for the 0-0, 0-1, and 1-1 pairs, respectively.

serialize() dict[source]
classmethod deserialize(data) PairTab[source]
_check_table_upper_boundary() None[source]

Update the user-provided table based on rcut.

This function checks the upper boundary provided in the table against rcut. If the table values decay to zero before rcut, zero padding is added so that the table covers rcut; if the values do not decay to zero before rcut, extrapolation is performed up to rcut.

Examples

table = [[0.005 1.    2.    3.   ]
         [0.01  0.8   1.6   2.4  ]
         [0.015 0.    1.    1.5  ]]

rcut = 0.022

new_table = [[0.005 1.    2.    3.   ]
             [0.01  0.8   1.6   2.4  ]
             [0.015 0.    1.    1.5  ]
             [0.02  0.    0.    0.   ]]

table = [[0.005 1.    2.    3.   ]
         [0.01  0.8   1.6   2.4  ]
         [0.015 0.5   1.    1.5  ]
         [0.02  0.25  0.4   0.75 ]
         [0.025 0.    0.1   0.   ]
         [0.03  0.    0.    0.   ]]

rcut = 0.031

new_table = [[0.005 1.    2.    3.   ]
             [0.01  0.8   1.6   2.4  ]
             [0.015 0.5   1.    1.5  ]
             [0.02  0.25  0.4   0.75 ]
             [0.025 0.    0.1   0.   ]
             [0.03  0.    0.    0.   ]
             [0.035 0.    0.    0.   ]]
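The zero-padding case in the first example can be reproduced with a small NumPy sketch (an illustration of the documented behavior, not the internal implementation):

```python
import numpy as np

def pad_zeros_to_rcut(table, rcut):
    """Append zero rows on the same grid until the table covers rcut."""
    hh = table[1, 0] - table[0, 0]  # grid spacing
    rows = [table]
    r = table[-1, 0] + hh
    while r < rcut:
        rows.append(np.array([[r] + [0.0] * (table.shape[1] - 1)]))
        r += hh
    return np.concatenate(rows, axis=0)

table = np.array(
    [
        [0.005, 1.0, 2.0, 3.0],
        [0.010, 0.8, 1.6, 2.4],
        [0.015, 0.0, 1.0, 1.5],
    ]
)
# As in the first example: one zero row at 0.02 is appended to cover rcut.
new_table = pad_zeros_to_rcut(table, rcut=0.022)
```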

get() Tuple[numpy.array, numpy.array][source]

Get the serialized table.

_extrapolate_table(pad_extrapolation: numpy.array) numpy.array[source]

Smooth extrapolation between the table upper boundary and rcut.

This method should only be used when the table upper boundary rmax is smaller than rcut, and the table upper boundary values are not zeros. To simplify the problem, we use a single cubic spline between rmax and rcut for each pair of atom types. One can substitute this extrapolation to higher order polynomials if needed.

There are two scenarios:
  1. rcut - rmax >= hh:

    Set the values at the grid point right before rcut to 0, and perform extrapolation between that grid point and rmax; this allows a smooth decay to 0 at rcut.

  2. rcut - rmax < hh:

    Set the values at rmax + hh to 0, and perform extrapolation between rmax and rmax + hh.

Parameters:
pad_extrapolation: np.array

The empty grid that holds the extrapolation values.

Returns:
np.array

The cubic spline extrapolation.

_make_data()[source]
class deepmd.tf.utils.Plugin[source]

A class to register and restore plugins.

Examples

>>> plugin = Plugin()
>>> @plugin.register("xx")
... def xxx():
...     pass
>>> print(plugin.plugins["xx"])
Attributes:
plugins: Dict[str, object]

The registered plugins.

__add__(other) Plugin[source]
register(key: str) Callable[[object], object][source]

Register a plugin.

Parameters:
key: str

key of the plugin

Returns:
Callable[[object], object]

decorator

get_plugin(key) object[source]

Visit a plugin by key.

Parameters:
key: str

key of the plugin

Returns:
object

the plugin
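The register/get_plugin pattern can be re-implemented in a few lines (a minimal self-contained sketch of the pattern, not the deepmd implementation; in particular, the merge behavior of __add__ here is an assumption):

```python
from typing import Callable, Dict


class Plugin:
    """Minimal register/lookup sketch of the plugin pattern."""

    def __init__(self) -> None:
        self.plugins: Dict[str, object] = {}

    def register(self, key: str) -> Callable[[object], object]:
        # Decorator factory: store the decorated object under key,
        # then return it unchanged so normal use is unaffected.
        def decorator(obj: object) -> object:
            self.plugins[key] = obj
            return obj

        return decorator

    def get_plugin(self, key: str) -> object:
        return self.plugins[key]

    def __add__(self, other: "Plugin") -> "Plugin":
        # Merge two registries; entries from `other` win on key conflicts.
        merged = Plugin()
        merged.plugins = {**self.plugins, **other.plugins}
        return merged


plugin = Plugin()


@plugin.register("xx")
def xxx() -> None:
    pass
```

After registration, `plugin.get_plugin("xx")` returns the `xxx` function itself.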

class deepmd.tf.utils.PluginVariant[source]

A class to remove type from input arguments.