lmdb format

Class: LMDBFormat

Class for handling the LMDB format, which stores atomic configurations in a Lightning Memory-Mapped Database (LMDB).

This format is optimized for machine learning workflows that need fast random access to a large number of frames. All frames from multiple systems (with potentially different numbers of atoms) are stored in a single LMDB database.

Both single systems and multiple systems are supported via the standard dpdata APIs.

Examples

Saving a single LabeledSystem

>>> import dpdata
>>> system = dpdata.LabeledSystem("path/to/input.vasp", fmt="vasp/outcar")
>>> system.to("lmdb", "my_single_system.lmdb")

Loading a single LabeledSystem

>>> loaded_system = dpdata.LabeledSystem("my_single_system.lmdb", fmt="lmdb")

Saving multiple systems to a single LMDB database

>>> import dpdata
>>> system_1 = dpdata.LabeledSystem("path/to/system1/OUTCAR", fmt="vasp/outcar")
>>> system_2 = dpdata.LabeledSystem("path/to/system2/OUTCAR", fmt="vasp/outcar")
>>> multi_systems_obj = dpdata.MultiSystems(system_1, system_2)
>>> multi_systems_obj.to("lmdb", "my_multi_system_db.lmdb")

Loading multiple systems from a single LMDB database

>>> import dpdata
>>> loaded_multi_systems = dpdata.MultiSystems.from_file("my_multi_system_db.lmdb", fmt="lmdb")

Conversions

Convert from this format to System

dpdata.System(file_name, fmt: Literal['lmdb'] = None) -> dpdata.system.System
dpdata.System.from_lmdb(file_name) -> dpdata.system.System

Load data for a single System from an LMDB database.

Returns:
System

converted system

Convert from System to this format

dpdata.System.to(fmt: Literal['lmdb'], file_name)
dpdata.System.to_lmdb(file_name)

Save a single System to an LMDB database.

Convert from LabeledSystem to this format

dpdata.LabeledSystem.to(fmt: Literal['lmdb'], file_name)
dpdata.LabeledSystem.to_lmdb(file_name)

Save a single LabeledSystem to an LMDB database.

Convert from this format to LabeledSystem

dpdata.LabeledSystem(file_name, fmt: Literal['lmdb'] = None) -> dpdata.system.LabeledSystem
dpdata.LabeledSystem.from_lmdb(file_name) -> dpdata.system.LabeledSystem

Load data for a single LabeledSystem from an LMDB database.

Returns:
LabeledSystem

converted system

Convert from this format to MultiSystems

dpdata.MultiSystems.from_lmdb(file_name, map_size=1000000000) -> dpdata.system.MultiSystems

Load multiple systems from a single LMDB database.

Parameters:
file_name : str

The path to the LMDB database directory.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

Returns:
MultiSystems

converted systems
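
The map_size argument is a plain byte count, so a large value is easier to read when computed from a unit constant. The following is a minimal sketch, not part of dpdata: the helper names map_size_gb and load_large_db are hypothetical, and the 8 GB figure is an arbitrary illustration. Only the from_lmdb call and its map_size keyword come from the signature above.

```python
GB = 1000 ** 3  # decimal gigabyte; the documented default map_size equals 1 * GB


def map_size_gb(n_gb: float) -> int:
    """Convert a size in decimal gigabytes to an integer byte count."""
    return int(n_gb * GB)


def load_large_db(path: str, n_gb: float = 8):
    """Hypothetical wrapper: load all systems with a larger map_size."""
    # Imported lazily so map_size_gb can be used without dpdata installed.
    import dpdata

    return dpdata.MultiSystems.from_lmdb(path, map_size=map_size_gb(n_gb))
```

A call such as load_large_db("my_multi_system_db.lmdb") would then pass map_size=8000000000 to from_lmdb.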

Convert from MultiSystems to this format

dpdata.MultiSystems.to(fmt: Literal['lmdb'], directory, map_size=1000000000, frame_idx_fmt='012d') -> dpdata.system.MultiSystems
dpdata.MultiSystems.to_lmdb(directory, map_size=1000000000, frame_idx_fmt='012d') -> dpdata.system.MultiSystems

Save multiple systems to a single LMDB database.

Parameters:
directory : str

The path to the LMDB database directory.

map_size : int, optional

Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).

frame_idx_fmt : str, optional

The format spec used to encode the frame index as a key. Default is "012d", which zero-pads the index to 12 digits.

Returns:
MultiSystems

this MultiSystems object
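
To see what a frame_idx_fmt value such as "012d" produces, the sketch below applies it with Python's built-in format(). The helper frame_key is hypothetical and only illustrates the format spec; how dpdata builds the complete database key (any prefix, and the byte encoding) is not described here and is not assumed.

```python
def frame_key(index: int, frame_idx_fmt: str = "012d") -> str:
    """Format a frame index with a format spec such as the default "012d"."""
    return format(index, frame_idx_fmt)


keys = [frame_key(i) for i in (0, 7, 123456)]
print(keys)  # ['000000000000', '000000000007', '000000123456']

# Zero-padded keys keep numeric order when compared as strings,
# unlike unpadded ones ("10" sorts before "2").
assert sorted(frame_key(i) for i in (10, 2, 1)) == [
    frame_key(1),
    frame_key(2),
    frame_key(10),
]
```

Fixed-width keys matter because LMDB orders keys lexicographically by default. A narrower spec can be passed through the documented keyword, for example multi_systems_obj.to("lmdb", "my_multi_system_db.lmdb", frame_idx_fmt="08d"), as long as the width still covers the largest frame index.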