deepmd.pd.utils.stat

Attributes

log

Functions

`make_stat_input`(→ dict[str, Any])	Pack data for statistics.
`_restore_from_file`(→ dict | None)
`_save_to_file`(→ None)
`_post_process_stat`(→ tuple[paddle.Tensor, paddle.Tensor])	Post process the statistics.
`_compute_model_predict`(→ dict[str, list[paddle.Tensor]])
`_make_preset_out_bias`(→ numpy.ndarray | None)	Make preset out bias.
`_fill_stat_with_global`(→ numpy.ndarray | None)	Fill atomic stat with global stat.
`compute_output_stats`(→ dict[str, Any])	Compute the output statistics (e.g. energy bias) for the fitting net from packed data.
`_compute_output_stats_global`(→ tuple[dict[str, ...)	This function only handles stat computation from reduced global labels.
`_compute_output_stats_atomic`(→ tuple[dict[str, ...)	Compute output statistics from atomic labels.

Module Contents

deepmd.pd.utils.stat.log

deepmd.pd.utils.stat.make_stat_input(datasets: list[Any], dataloaders: list[Any], nbatches: int) → dict[str, Any]

Pack data for statistics.

Args:

- datasets: A list of datasets to analyze.
- dataloaders: A list of data loaders, one for each dataset.
- nbatches: The number of batches to collect for the statistics.

Returns:

A list of dicts, each of which contains the data from one system.
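
The packing step can be pictured with a small NumPy-only sketch. This is an illustration under assumptions, not the module's implementation: `pack_stat_input` is a hypothetical helper, each loader is taken to be a plain iterable of dict batches, and batches are merged by concatenating along the first (frame) axis.

```python
import numpy as np


def pack_stat_input(dataloaders, nbatches):
    """Collect up to `nbatches` batches per loader and merge them key by key.

    Hypothetical helper for illustration only.
    """
    packed = []
    for loader in dataloaders:
        collected = {}
        for i, batch in enumerate(loader):
            if i >= nbatches:
                break
            for key, value in batch.items():
                collected.setdefault(key, []).append(np.asarray(value))
        # Concatenate along the frame axis so that each system yields one dict.
        packed.append({k: np.concatenate(v, axis=0) for k, v in collected.items()})
    return packed


# Two toy "systems"; each loader yields 5 batches of 2 frames.
loader_a = [{"energy": np.ones((2, 1))} for _ in range(5)]
loader_b = [{"energy": np.zeros((2, 1))} for _ in range(5)]
stats_input = pack_stat_input([loader_a, loader_b], nbatches=3)
```

Here `stats_input` holds one dict per system, and each array covers only the first three batches of its loader.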

deepmd.pd.utils.stat._restore_from_file(stat_file_path: deepmd.utils.path.DPPath, keys: list[str] = ['energy']) → dict | None

deepmd.pd.utils.stat._save_to_file(stat_file_path: deepmd.utils.path.DPPath, bias_out: dict, std_out: dict) → None

deepmd.pd.utils.stat._post_process_stat(out_bias: paddle.Tensor, out_std: paddle.Tensor) → tuple[paddle.Tensor, paddle.Tensor]

Post process the statistics.

For global statistics, the std is not available for each atom type, so the output std is faked with ones for all types. If the shape of out_std is already the same as that of out_bias, nothing needs to be done.
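
The rule described above can be sketched as follows. This is a minimal illustration that uses NumPy arrays in place of paddle.Tensor; `post_process_stat` is a hypothetical stand-in rather than the function's actual body.

```python
import numpy as np


def post_process_stat(out_bias, out_std):
    """Return (bias, std), replacing std with ones when its shape differs from bias.

    Hypothetical NumPy stand-in for illustration only.
    """
    if out_std.shape == out_bias.shape:
        # A per-type std is already available, so keep it.
        return out_bias, out_std
    # Only a global std is available: fake a per-type std of ones.
    return out_bias, np.ones_like(out_bias)


bias = np.array([[-1.5], [0.5]])      # shape [ntypes, odim]
global_std = np.array([0.3])          # shape differs from bias
_, faked_std = post_process_stat(bias, global_std)
_, kept_std = post_process_stat(bias, np.array([[0.1], [0.2]]))
```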

deepmd.pd.utils.stat._compute_model_predict(sampled: collections.abc.Callable[[], list[dict]] | list[dict], keys: list[str], model_forward: collections.abc.Callable[Ellipsis, paddle.Tensor]) → dict[str, list[paddle.Tensor]]

deepmd.pd.utils.stat._make_preset_out_bias(ntypes: int, ibias: list[numpy.ndarray | None]) → numpy.ndarray | None

Make preset out bias.

Output: a np array of shape [ntypes, *(odim0, odim1, …)] if any item is not None; None if all items are None.
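
A short sketch of how such an array could be assembled is given below. It is illustrative only: `make_preset_out_bias` here is a hypothetical re-implementation, and the use of NaN as the placeholder for types without a preset is an assumption of this sketch.

```python
import numpy as np


def make_preset_out_bias(ntypes, ibias):
    """Stack per-type preset biases into one array, or return None if none is set.

    Hypothetical sketch; NaN marks types whose bias is not preset (assumption).
    """
    if len(ibias) != ntypes:
        raise ValueError("ibias must contain one entry per atom type")
    if all(item is None for item in ibias):
        return None
    # Use the first preset entry as the reference for the output shape.
    ref = next(np.asarray(item, dtype=np.float64) for item in ibias if item is not None)
    rows = [
        np.full_like(ref, np.nan) if item is None else np.asarray(item, dtype=np.float64)
        for item in ibias
    ]
    return np.stack(rows)


preset = make_preset_out_bias(2, [None, [2.0]])   # type 0 unset, type 1 set to [2.]
nothing = make_preset_out_bias(2, [None, None])
```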

deepmd.pd.utils.stat._fill_stat_with_global(atomic_stat: numpy.ndarray | None, global_stat: numpy.ndarray) → numpy.ndarray | None

This function is used to fill atomic stat with global stat.

Parameters:

atomic_stat : Union[np.ndarray, None]

The atomic stat.

global_stat : np.ndarray

The global stat.

- If the atomic stat is None, use the global stat.
- If the atomic stat is not None but has NaN values (missing atom types), fill those entries with the global stat.
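
The two cases can be sketched with NumPy as follows. `fill_stat_with_global` is a hypothetical stand-in written for illustration; it assumes both arrays can be brought to the same shape.

```python
import numpy as np


def fill_stat_with_global(atomic_stat, global_stat):
    """Fall back to the global stat where the atomic stat is missing.

    Hypothetical sketch for illustration only.
    """
    if atomic_stat is None:
        # No atomic stat at all: use the global stat as is.
        return global_stat
    atomic_stat = np.asarray(atomic_stat, dtype=np.float64).reshape(global_stat.shape)
    # NaN entries mark atom types without atomic labels; take the global value there.
    return np.where(np.isnan(atomic_stat), global_stat, atomic_stat)


global_stat = np.array([[10.0], [20.0]])
atomic_stat = np.array([[1.0], [np.nan]])   # type 1 has no atomic statistics
filled = fill_stat_with_global(atomic_stat, global_stat)
fallback = fill_stat_with_global(None, global_stat)
```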

deepmd.pd.utils.stat.compute_output_stats(merged: collections.abc.Callable[[], list[dict]] | list[dict], ntypes: int, keys: str | list[str] = ['energy'], stat_file_path: deepmd.utils.path.DPPath | None = None, rcond: float | None = None, preset_bias: dict[str, list[numpy.ndarray | None]] | None = None, model_forward: collections.abc.Callable[Ellipsis, paddle.Tensor] | None = None, stats_distinguish_types: bool = True, intensive: bool = False) → dict[str, Any]

Compute the output statistics (e.g. energy bias) for the fitting net from packed data.

Parameters:

merged : Union[Callable[[], list[dict]], list[dict]]

- list[dict]: A list of data samples from various data systems. Each element, merged[i], is a data dictionary of key: paddle.Tensor pairs originating from the i-th data system.
- Callable[[], list[dict]]: A lazy function that returns data samples in the above format only when needed. Since the sampling process can be slow and memory-intensive, the lazy function helps by sampling only once.

ntypes : int

The number of atom types.

stat_file_path : DPPath, optional

The path to the stat file.

rcond : float, optional

The condition number for the least-squares regression of the atomic energy, used as the cutoff for small singular values.

preset_bias : dict[str, list[Optional[np.ndarray]]], optional

Specifies the atomic energy contribution in vacuum, given as key:value pairs. Each value is a list specifying the bias per type; its elements can be None or an np.ndarray of the output shape. For example, [None, [2.]] means type 0 is not set and type 1 is set to [2.]. The set_davg_zero key in the descriptor should be set.

model_forward : Callable[…, paddle.Tensor], optional

The wrapped forward function of the atomic model. If not None, the model is used to generate the original energy prediction, which is subtracted from the energy label of the data. The difference is then used to calculate the delta (complementary) energy bias for each type.

stats_distinguish_types : bool, optional

Whether to distinguish different element types in the statistics.

intensive : bool, optional

Whether the fitting target is intensive.
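
Two of the ideas in the parameter descriptions, lazy sampling and subtracting the model prediction from the label, can be sketched as below. This is a NumPy-only illustration under assumptions: `resolve_samples`, `residual_labels`, `lazy_samples`, and `constant_model` are hypothetical names, and `functools.lru_cache` is one possible way to make a lazy sampler run only once, not necessarily the mechanism used by the library.

```python
from functools import lru_cache

import numpy as np

calls = []  # records how often the expensive sampling actually runs


@lru_cache(maxsize=None)
def lazy_samples():
    """Hypothetical lazy sampler; the cache makes repeated calls free."""
    calls.append(1)
    return [{"energy": np.array([[3.0], [5.0]])}]


def resolve_samples(merged):
    """Accept either a list of sample dicts or a callable returning one."""
    return merged() if callable(merged) else merged


def residual_labels(samples, key, predict):
    """Subtract the model prediction from the label of every system."""
    return [sample[key] - predict(sample) for sample in samples]


def constant_model(sample):
    """Toy stand-in for a wrapped forward function: always predicts 1."""
    return np.ones_like(sample["energy"])


samples = resolve_samples(lazy_samples)
samples_again = resolve_samples(lazy_samples)   # served from the cache
delta = residual_labels(samples, "energy", constant_model)
```

The residuals in `delta` are what a bias fit would then be run on when a model prediction is available.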

deepmd.pd.utils.stat._compute_output_stats_global(sampled: list[dict], ntypes: int, keys: list[str], rcond: float | None = None, preset_bias: dict[str, list[paddle.Tensor | None]] | None = None, global_sampled_idx: dict | None = None, stats_distinguish_types: bool = True, intensive: bool = False, model_pred: dict[str, numpy.ndarray] | None = None) → tuple[dict[str, numpy.ndarray], dict[str, numpy.ndarray]]

This function only handles stat computation from reduced global labels.
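
As a rough picture of a bias fit from reduced (per-frame) labels, the sketch below regresses frame energies on per-type atom counts with `np.linalg.lstsq`. It is an assumption-laden illustration, not the function's body: `fit_type_bias` is a hypothetical helper, and the model that a frame energy equals the sum of atom counts times per-type biases is the assumption being fitted.

```python
import numpy as np


def fit_type_bias(type_counts, energies, rcond=None):
    """Least-squares fit of per-type biases from per-frame totals.

    type_counts: [nframes, ntypes] number of atoms of each type per frame.
    energies:    [nframes, odim] reduced (global) labels.
    Hypothetical helper for illustration only.
    """
    bias, _, _, _ = np.linalg.lstsq(type_counts, energies, rcond=rcond)
    # Spread of what the fitted biases cannot explain.
    residual = energies - type_counts @ bias
    return bias, residual.std(axis=0)


counts = np.array([[2.0, 1.0], [1.0, 3.0], [4.0, 2.0], [3.0, 3.0]])
true_bias = np.array([[-1.5], [0.5]])
energies = counts @ true_bias          # noise-free toy labels
bias, std = fit_type_bias(counts, energies)
```

With noise-free toy data the fit recovers `true_bias` and the residual std is numerically zero; real labels leave a nonzero residual.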

deepmd.pd.utils.stat._compute_output_stats_atomic(sampled: list[dict], ntypes: int, keys: list[str], atomic_sampled_idx: dict | None = None, model_pred: dict[str, numpy.ndarray] | None = None) → tuple[dict[str, numpy.ndarray], dict[str, numpy.ndarray]]

Compute output statistics from atomic labels.
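
When labels are available per atom, one simple way to obtain statistics is a per-type mean and standard deviation. The sketch below illustrates that idea only; `per_type_mean_std` is a hypothetical helper, and marking types without any atoms as NaN is an assumption of this sketch.

```python
import numpy as np


def per_type_mean_std(atom_types, atomic_labels, ntypes):
    """Mean and std of atomic labels grouped by atom type.

    atom_types:    [natoms] integer type of each atom.
    atomic_labels: [natoms, odim] label of each atom.
    Types that never occur stay NaN (assumption of this sketch).
    """
    odim = atomic_labels.shape[1:]
    bias = np.full((ntypes, *odim), np.nan)
    std = np.full((ntypes, *odim), np.nan)
    for t in range(ntypes):
        mask = atom_types == t
        if mask.any():
            bias[t] = atomic_labels[mask].mean(axis=0)
            std[t] = atomic_labels[mask].std(axis=0)
    return bias, std


atom_types = np.array([0, 0, 1, 0])
labels = np.array([[1.0], [3.0], [10.0], [2.0]])
bias, std = per_type_mean_std(atom_types, labels, ntypes=3)
```

The NaN rows for missing types are the kind of gap that a global statistic can fill afterwards.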