dpdata.lmdb package

class dpdata.lmdb.LMDBFormat

Bases: Format

Class for handling the LMDB format, which stores atomic configurations in a Lightning Memory-Mapped Database (LMDB).

This format is optimized for machine learning workflows where fast, random access to a large number of frames is required. All frames from multiple systems (with potentially different numbers of atoms) are stored in a single LMDB database file.

Both single systems and multiple systems are supported via the standard dpdata APIs.

Methods

MultiModes()

File mode for MultiSystems.

from_bond_order_system(file_name, **kwargs)

Implement BondOrderSystem.from that converts from this format to BondOrderSystem.

from_labeled_system(file_name, **kwargs)

Load data for a single LabeledSystem from an LMDB database.

from_multi_systems(file_name[, map_size])

Load multiple systems from a single LMDB database.

from_system(file_name, **kwargs)

Load data for a single System from an LMDB database.

get_formats()

Get all registered formats.

get_from_methods()

Get all registered from methods.

get_to_methods()

Get all registered to methods.

mix_system(*system, type_map, **kwargs)

Mix the systems into mixed_type ones according to the unified given type_map.

post(func_name)

Register a post function for from method.

register(key)

Register a format plugin.

register_from(key)

Register a from method if the target method name is not default.

register_to(key)

Register a to method if the target method name is not default.

to_bond_order_system(data, rdkit_mol, *args, ...)

Implement BondOrderSystem.to that converts from BondOrderSystem to this format.

to_labeled_system(data, file_name, **kwargs)

Save a single LabeledSystem to an LMDB database.

to_multi_systems(formulas, directory[, ...])

Implement MultiSystems.to for LMDB format.

to_system(data, file_name, **kwargs)

Save a single System to an LMDB database.

Examples

Saving a single LabeledSystem

>>> import dpdata
>>> system = dpdata.LabeledSystem("path/to/input.vasp", fmt="vasp/outcar")
>>> system.to("lmdb", "my_single_system.lmdb")

Loading a single LabeledSystem

>>> loaded_system = dpdata.LabeledSystem("my_single_system.lmdb", fmt="lmdb")

Saving multiple systems to a single LMDB database

>>> import dpdata
>>> system_1 = dpdata.LabeledSystem("path/to/system1/OUTCAR", fmt="vasp/outcar")
>>> system_2 = dpdata.LabeledSystem("path/to/system2/OUTCAR", fmt="vasp/outcar")
>>> multi_systems_obj = dpdata.MultiSystems(system_1, system_2)
>>> multi_systems_obj.to("lmdb", "my_multi_system_db.lmdb")

Loading multiple systems from a single LMDB database

>>> import dpdata
>>> loaded_multi_systems = dpdata.MultiSystems.from_file("my_multi_system_db.lmdb", fmt="lmdb")

from_labeled_system(file_name, **kwargs)

Load data for a single LabeledSystem from an LMDB database.

from_multi_systems(file_name, map_size=1000000000, **kwargs)

Load multiple systems from a single LMDB database.

Parameters:
file_name : str

The path to the LMDB database directory.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

**kwargs : dict

Other parameters.

Yields:
dict

data dictionary for each system

from_system(file_name, **kwargs)

Load data for a single System from an LMDB database.

to_labeled_system(data, file_name, **kwargs)

Save a single LabeledSystem to an LMDB database.

to_multi_systems(formulas, directory, map_size=1000000000, frame_idx_fmt='012d', **kwargs)

Implement MultiSystems.to for LMDB format.

Parameters:
formulas : list[str]

List of formulas, one per system.

directory : str

Directory of the system.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

frame_idx_fmt : str, optional

The format string used to encode the frame index as a key. Default is "012d" (an integer zero-padded to 12 digits).

**kwargs : dict

Other parameters.

Yields:
tuple

(self, formula) to be used by to_system
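
The effect of frame_idx_fmt on keys can be illustrated with plain Python. This is only a sketch: the helper frame_key is hypothetical, and it assumes the key is produced by applying the format string to the integer frame index and encoding the result as ASCII. The actual key layout used by dpdata is not described on this page and may differ (for example, it may add a prefix).

```python
# Assumption: keys are built as format(index, frame_idx_fmt), ASCII-encoded.
def frame_key(index: int, frame_idx_fmt: str = "012d") -> bytes:
    """Return a fixed-width byte key for a frame index (hypothetical helper)."""
    return format(index, frame_idx_fmt).encode("ascii")


indices = (0, 7, 10, 123456)
keys = [frame_key(i) for i in indices]
print(keys[1])  # b'000000000007'

# Zero padding keeps byte-wise (lexicographic) order equal to numeric order.
assert sorted(keys) == keys

# Without padding the order breaks: b"10" sorts before b"7".
unpadded = [str(i).encode("ascii") for i in indices]
assert sorted(unpadded) != unpadded
```

LMDB keeps keys sorted byte-wise by default, so a fixed-width, zero-padded index lets a cursor visit frames in numeric order. A width of 12 digits covers indices up to 999999999999.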

to_system(data, file_name, **kwargs)

Save a single System to an LMDB database.

Submodules

dpdata.lmdb.format module

exception dpdata.lmdb.format.LMDBError

Bases: Exception

Base class for LMDB errors.

class dpdata.lmdb.format.LMDBFormat

Bases: Format

Class for handling the LMDB format, which stores atomic configurations in a Lightning Memory-Mapped Database (LMDB).

This format is optimized for machine learning workflows where fast, random access to a large number of frames is required. All frames from multiple systems (with potentially different numbers of atoms) are stored in a single LMDB database file.

Both single systems and multiple systems are supported via the standard dpdata APIs.

Methods

MultiModes()

File mode for MultiSystems.

from_bond_order_system(file_name, **kwargs)

Implement BondOrderSystem.from that converts from this format to BondOrderSystem.

from_labeled_system(file_name, **kwargs)

Load data for a single LabeledSystem from an LMDB database.

from_multi_systems(file_name[, map_size])

Load multiple systems from a single LMDB database.

from_system(file_name, **kwargs)

Load data for a single System from an LMDB database.

get_formats()

Get all registered formats.

get_from_methods()

Get all registered from methods.

get_to_methods()

Get all registered to methods.

mix_system(*system, type_map, **kwargs)

Mix the systems into mixed_type ones according to the unified given type_map.

post(func_name)

Register a post function for from method.

register(key)

Register a format plugin.

register_from(key)

Register a from method if the target method name is not default.

register_to(key)

Register a to method if the target method name is not default.

to_bond_order_system(data, rdkit_mol, *args, ...)

Implement BondOrderSystem.to that converts from BondOrderSystem to this format.

to_labeled_system(data, file_name, **kwargs)

Save a single LabeledSystem to an LMDB database.

to_multi_systems(formulas, directory[, ...])

Implement MultiSystems.to for LMDB format.

to_system(data, file_name, **kwargs)

Save a single System to an LMDB database.

Examples

Saving a single LabeledSystem

>>> import dpdata
>>> system = dpdata.LabeledSystem("path/to/input.vasp", fmt="vasp/outcar")
>>> system.to("lmdb", "my_single_system.lmdb")

Loading a single LabeledSystem

>>> loaded_system = dpdata.LabeledSystem("my_single_system.lmdb", fmt="lmdb")

Saving multiple systems to a single LMDB database

>>> import dpdata
>>> system_1 = dpdata.LabeledSystem("path/to/system1/OUTCAR", fmt="vasp/outcar")
>>> system_2 = dpdata.LabeledSystem("path/to/system2/OUTCAR", fmt="vasp/outcar")
>>> multi_systems_obj = dpdata.MultiSystems(system_1, system_2)
>>> multi_systems_obj.to("lmdb", "my_multi_system_db.lmdb")

Loading multiple systems from a single LMDB database

>>> import dpdata
>>> loaded_multi_systems = dpdata.MultiSystems.from_file("my_multi_system_db.lmdb", fmt="lmdb")

from_labeled_system(file_name, **kwargs)

Load data for a single LabeledSystem from an LMDB database.

from_multi_systems(file_name, map_size=1000000000, **kwargs)

Load multiple systems from a single LMDB database.

Parameters:
file_name : str

The path to the LMDB database directory.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

**kwargs : dict

Other parameters.

Yields:
dict

data dictionary for each system

from_system(file_name, **kwargs)

Load data for a single System from an LMDB database.

to_labeled_system(data, file_name, **kwargs)

Save a single LabeledSystem to an LMDB database.

to_multi_systems(formulas, directory, map_size=1000000000, frame_idx_fmt='012d', **kwargs)

Implement MultiSystems.to for LMDB format.

Parameters:
formulas : list[str]

List of formulas, one per system.

directory : str

Directory of the system.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

frame_idx_fmt : str, optional

The format string used to encode the frame index as a key. Default is "012d" (an integer zero-padded to 12 digits).

**kwargs : dict

Other parameters.

Yields:
tuple

(self, formula) to be used by to_system
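
Because map_size is an upper bound on the database size, a large dataset may need a value above the 1 GB default. The sketch below gives a rough way to pick one. The helper estimate_map_size, the list of per-frame arrays, and the safety factor are all assumptions made for illustration; the real on-disk size depends on how dpdata serializes each frame, which this page does not specify.

```python
DEFAULT_MAP_SIZE = 1_000_000_000  # default shown in the signature, in bytes


def estimate_map_size(n_frames, n_atoms, bytes_per_float=8, safety_factor=2.0):
    """Rough upper bound on the bytes needed (hypothetical helper)."""
    # Assumed per-frame payload: coordinates and forces (3 floats per atom
    # each), plus cell (9 floats), virial (9 floats) and energy (1 float).
    floats_per_frame = 2 * 3 * n_atoms + 9 + 9 + 1
    raw_bytes = n_frames * floats_per_frame * bytes_per_float
    # The safety factor leaves headroom for keys, metadata and page overhead.
    return int(raw_bytes * safety_factor)


needed = estimate_map_size(n_frames=1_000_000, n_atoms=200)
map_size = max(DEFAULT_MAP_SIZE, needed)
print(needed, map_size)  # 19504000000 19504000000
```

The resulting value would then be passed as map_size when writing, assuming keyword arguments given to MultiSystems.to are forwarded to to_multi_systems.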

to_system(data, file_name, **kwargs)

Save a single System to an LMDB database.

exception dpdata.lmdb.format.LMDBFrameError

Bases: LMDBError

Frame data not found in LMDB.

exception dpdata.lmdb.format.LMDBMetadataError

Bases: LMDBError

Metadata not found in LMDB.
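
Since LMDBFrameError and LMDBMetadataError both derive from LMDBError, a caller can handle one case specifically and fall back to the base class for the rest. The sketch below shows that pattern with local stand-in classes so it runs without dpdata installed. The reader function, the dict used as a database, and the metadata key name are hypothetical and do not reflect how dpdata actually raises these errors.

```python
# Local stand-ins that mirror the documented hierarchy; real code would
# import these classes from dpdata.lmdb.format instead.
class LMDBError(Exception):
    """Base class for LMDB errors."""


class LMDBMetadataError(LMDBError):
    """Metadata not found in LMDB."""


class LMDBFrameError(LMDBError):
    """Frame data not found in LMDB."""


def read_frame(store, key):
    """Hypothetical reader over a dict standing in for the database."""
    if b"__metadata__" not in store:  # hypothetical metadata key
        raise LMDBMetadataError("metadata entry is missing")
    if key not in store:
        raise LMDBFrameError(f"no frame stored under {key!r}")
    return store[key]


def describe(store, key):
    """Handle the specific error first, then fall back to the base class."""
    try:
        return read_frame(store, key)
    except LMDBFrameError as err:
        return f"frame error: {err}"
    except LMDBError as err:  # also catches LMDBMetadataError
        return f"LMDB error: {err}"


store = {b"__metadata__": b"{}", b"000000000000": b"frame-0"}
print(describe(store, b"000000000000"))
print(describe(store, b"000000000001"))
print(describe({}, b"000000000000"))
```

Ordering the except clauses from most specific to most general matters here: if the LMDBError clause came first, it would also catch the frame error.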