deepmd.tf.utils.data_system

Alias for backward compatibility.

Classes

DeepmdDataSystem

Class for manipulating many data systems.

Functions

`prob_sys_size_ext`(→ list[float])
`process_sys_probs`(→ numpy.ndarray)

Module Contents

class deepmd.tf.utils.data_system.DeepmdDataSystem(systems: list[str], batch_size: int, test_size: int, rcut: float | None = None, set_prefix: str = 'set', shuffle_test: bool = True, type_map: list[str] | None = None, optional_type_map: bool = True, modifier: Any | None = None, trn_all_set: bool = False, sys_probs: list[float] | None = None, auto_prob_style: str = 'prob_sys_size', sort_atoms: bool = True)

Class for manipulating many data systems.

It is implemented with the help of DeepmdData.
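As a rough illustration of how the class might be used, the sketch below builds a data system from a list of system directories, registers two data items, and requests one batch. The helper name `load_first_batch`, the default argument values, and the choice of `energy` and `force` as keys (with 1 and 3 degrees of freedom) are assumptions made for this example; only the constructor and method signatures come from this page. The import is done inside the function so the snippet can be defined without DeePMD-kit installed.

```python
def load_first_batch(system_dirs, batch_size=1, test_size=1, rcut=6.0):
    """Hypothetical helper: build a DeepmdDataSystem and return one batch.

    system_dirs is a list of paths to system directories; each is assumed
    to contain set.* sub-directories with .npy files.
    """
    # Lazy import: DeePMD-kit is only needed when the function is called.
    from deepmd.tf.utils.data_system import DeepmdDataSystem

    data = DeepmdDataSystem(
        systems=system_dirs,
        batch_size=batch_size,
        test_size=test_size,
        rcut=rcut,
    )
    # Assumed labels: one energy per frame and a 3-vector force per atom.
    data.add("energy", 1, atomic=False, must=False, high_prec=True)
    data.add("force", 3, atomic=True, must=False, high_prec=False)
    # With sys_idx left as None the system is picked automatically.
    return data.get_batch()
```

Calling `load_first_batch(["path/to/system"])` would require real data on disk; the path here is a placeholder.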

system_dirs

nsystems

data_systems = []

batch_size

mixed_systems = False

sys_ntypes

natoms = []

natoms_vec = []

nbatches = []

type_map = []

test_size

pick_idx = 0

sys_probs = None

_load_test(ntests: int = -1) → None

property default_mesh: list[numpy.ndarray]: Mesh for each system.

compute_energy_shift(rcond: float | None = None, key: str = 'energy') → tuple[numpy.ndarray, numpy.ndarray]

add_dict(adict: dict[str, dict[str, Any]]) → None

Add items to the data system by a dict. adict should have items like:

    adict[key] = {
        "ndof": ndof,
        "atomic": atomic,
        "must": must,
        "high_prec": high_prec,
        "type_sel": type_sel,
        "repeat": repeat,
    }

For the explanation of the keys, see add.
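A minimal sketch of building such a dict in plain Python follows. The helper `make_requirement` and the example keys `energy` and `force` are hypothetical and not part of the library; the six entry names are the ones listed in the description of add_dict.

```python
def make_requirement(ndof, atomic=False, must=False, high_prec=False,
                     type_sel=None, repeat=1):
    """Hypothetical helper: build one entry of the dict passed to add_dict."""
    return {
        "ndof": ndof,
        "atomic": atomic,
        "must": must,
        "high_prec": high_prec,
        "type_sel": type_sel,
        "repeat": repeat,
    }


# Assumed example items: a per-frame energy and a per-atom force vector.
adict = {
    "energy": make_requirement(1, must=True, high_prec=True),
    "force": make_requirement(3, atomic=True),
}
```

The resulting `adict` could then be passed as `data.add_dict(adict)` on an existing data system.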

add_data_requirements(data_requirements: list[deepmd.utils.data.DataRequirementItem]) → None: Add items to the data system by a list of DataRequirementItem.

add(key: str, ndof: int, atomic: bool = False, must: bool = False, high_prec: bool = False, type_sel: list[int] | None = None, repeat: int = 1, default: float = 0.0, dtype: numpy.dtype | None = None, output_natoms_for_type_sel: bool = False) → None

Add a data item to be loaded.

Parameters:

key: The key of the item. The corresponding data is stored in sys_path/set.*/key.npy.
ndof: The number of degrees of freedom (dof).
atomic: Whether the item is an atomic property. If False, the size of the data should be nframes x ndof. If True, the size of the data should be nframes x natoms x ndof.
must: If True, the data file sys_path/set.*/key.npy must exist. If must is False and the data file does not exist, data_dict[find_key] is set to 0.0.
high_prec: If True, load the data and store it in float64; otherwise in float32.
type_sel: Select certain types of atoms.
repeat: The data will be repeated repeat times.
default: Default value of the data (default: 0.0).
dtype: The dtype of the data; overrides high_prec if provided.
output_natoms_for_type_sel (bool): If True and type_sel is set, the atomic dimension will be natoms instead of nsel.
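The shape and precision rules in the parameter list can be summarised in a small sketch. Both functions are hypothetical illustrations written for this page, not library code. They ignore `repeat`, and they assume that `nsel` denotes the number of atoms selected by `type_sel`.

```python
def expected_shape(nframes, natoms, ndof, atomic,
                   nsel=None, output_natoms_for_type_sel=False):
    """Shape a data item is expected to have, following the rules for add."""
    if not atomic:
        # Frame-level property: one row of ndof values per frame.
        return (nframes, ndof)
    # Atomic property: the atom axis is natoms, or nsel when a type
    # selection is active and natoms output was not requested.
    if nsel is None or output_natoms_for_type_sel:
        return (nframes, natoms, ndof)
    return (nframes, nsel, ndof)


def resolve_dtype(high_prec, dtype=None):
    """Precision used for storage: an explicit dtype overrides high_prec."""
    if dtype is not None:
        return dtype
    return "float64" if high_prec else "float32"
```

For example, a force-like item with `ndof=3` and `atomic=True` over 10 frames of 5 atoms would be expected to have shape `(10, 5, 3)`.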

reduce(key_out: str, key_in: str) → None

Generate a new item from the reduction of another item.

Parameters:

key_out: The name of the reduced item
key_in: The name of the data item to be reduced
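This page does not state which reduction is applied. The sketch below assumes it is a sum over the atom axis, turning data of size nframes x natoms x ndof into nframes x ndof; that assumption and the function name `reduce_over_atoms` are illustrative only.

```python
def reduce_over_atoms(atomic_values):
    """Sum an atomic item over atoms (assumed meaning of the reduction).

    atomic_values is a nested list of size nframes x natoms x ndof;
    the result has size nframes x ndof.
    """
    reduced = []
    for frame in atomic_values:
        ndof = len(frame[0])
        # Add up component d over all atoms of this frame.
        reduced.append([sum(atom[d] for atom in frame) for d in range(ndof)])
    return reduced
```

Under this assumption, a call such as `data.reduce("energy", "atom_ener")` would register a frame-level item computed from a per-atom one; both key names are examples.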

get_data_dict(ii: int = 0) → dict

set_sys_probs(sys_probs: list[float] | None = None, auto_prob_style: str = 'prob_sys_size') → None
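The default `auto_prob_style` is `'prob_sys_size'`. The sketch below assumes this style makes each system's probability proportional to its number of batches; that reading and the function name `probs_from_sizes` are assumptions for illustration, not a description of the library's code.

```python
def probs_from_sizes(nbatches):
    """Probabilities proportional to the number of batches per system."""
    total = sum(nbatches)
    if total <= 0:
        raise ValueError("at least one system must contain a batch")
    return [n / total for n in nbatches]
```

Under this assumption, a system with three times as many batches as another would be drawn three times as often.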

get_batch(sys_idx: int | None = None) → dict

Get a batch of data from the data systems.

Parameters:

sys_idx (int): The index of the system from which the batch is taken. If sys_idx is not None, sys_probs and auto_prob_style are ignored. If sys_idx is None, the system is determined automatically according to sys_probs or auto_prob_style. This option does not work for mixed systems.

Returns:

dict: The batch data
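The selection rule described for `sys_idx` can be sketched with the standard library. The function `pick_system` is a hypothetical stand-in that mirrors the documented behaviour: an explicit index wins, otherwise a system is drawn at random with the given probabilities. It is not the library's implementation.

```python
import random


def pick_system(sys_probs, sys_idx=None, rng=random):
    """Return the index of the system to draw a batch from."""
    if sys_idx is not None:
        # An explicit index overrides any probabilities.
        return sys_idx
    # Weighted random draw over the system indices.
    return rng.choices(range(len(sys_probs)), weights=sys_probs, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible.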

get_batch_standard(sys_idx: int | None = None) → dict

Get a batch of data from the data systems in the standard way.

Parameters:

sys_idx (int): The index of the system from which the batch is taken. If sys_idx is not None, sys_probs and auto_prob_style are ignored. If sys_idx is None, the system is determined automatically according to sys_probs or auto_prob_style.

Returns:

dict: The batch data

get_batch_mixed() → dict

Get a batch of data from the data systems in the mixed way.

Returns:

dict: The batch data

_merge_batch_data(batch_data: list[dict]) → dict

Merge batch data from different systems.

Parameters:

batch_data (list of dict): A list of batch data from different systems.

Returns:

dict: The merged batch data.

get_test(sys_idx: int | None = None, n_test: int = -1) → dict[str, numpy.ndarray]

Get test data from the data systems.

Parameters:

sys_idx: The test data of the system with index sys_idx will be returned. If it is None, the test data of the currently selected system will be returned.
n_test: Number of test data. If set to -1, all test data will be returned.

get_sys_ntest(sys_idx: int | None = None) → int: Get the number of tests for the currently selected system, or for the one defined by sys_idx.

get_type_map() → list[str]: Get the type map.

get_nbatches() → int: Get the total number of batches.

get_ntypes() → int: Get the number of types.

get_nsystems() → int: Get the number of data systems.

get_sys(idx: int) → deepmd.utils.data.DeepmdData: Get a certain data system.

get_batch_size() → int: Get the batch size.

print_summary(name: str) → None

_make_auto_bs(rule: int) → list[int]

_make_auto_ts(percent: float) → list[int]

_check_type_map_consistency(type_map_list: list[list[str] | None]) → list[str]

deepmd.tf.utils.data_system.prob_sys_size_ext(keywords: str, nsystems: int, nbatch: int) → list[float]

deepmd.tf.utils.data_system.process_sys_probs(sys_probs: list[float], nbatch: int) → numpy.ndarray