lmdb format#
Class: LMDBFormat
Class for handling the LMDB format, which stores atomic configurations in a Lightning Memory-Mapped Database (LMDB).
This format is optimized for machine learning workflows where fast, random access to a large number of frames is required. All frames from multiple systems (with potentially different numbers of atoms) are stored in a single LMDB database file.
Both single systems and multiple systems are supported via the standard dpdata APIs.
Examples
Saving a single LabeledSystem
>>> import dpdata
>>> system = dpdata.LabeledSystem("path/to/input.vasp", fmt="vasp/outcar")
>>> system.to("lmdb", "my_single_system.lmdb")
Loading a single LabeledSystem
>>> loaded_system = dpdata.LabeledSystem("my_single_system.lmdb", fmt="lmdb")
Saving multiple systems to a single LMDB database
>>> import dpdata
>>> system_1 = dpdata.LabeledSystem("path/to/system1/OUTCAR", fmt="vasp/outcar")
>>> system_2 = dpdata.LabeledSystem("path/to/system2/OUTCAR", fmt="vasp/outcar")
>>> multi_systems_obj = dpdata.MultiSystems(system_1, system_2)
>>> multi_systems_obj.to("lmdb", "my_multi_system_db.lmdb")
Loading multiple systems from a single LMDB database
>>> import dpdata
>>> loaded_multi_systems = dpdata.MultiSystems.from_file("my_multi_system_db.lmdb", fmt="lmdb")
Conversions#
Convert from this format to System#
- dpdata.System(file_name, fmt: Literal['lmdb'] = None) -> dpdata.system.System
- dpdata.System.from_lmdb(file_name) -> dpdata.system.System
Load data for a single System from an LMDB database.
- Returns:
- System
converted system
Convert from System to this format#
- dpdata.System.to(fmt: Literal['lmdb'], file_name)
- dpdata.System.to_lmdb(file_name)
Save a single System to an LMDB database.
Convert from LabeledSystem to this format#
- dpdata.LabeledSystem.to(fmt: Literal['lmdb'], file_name)
- dpdata.LabeledSystem.to_lmdb(file_name)
Save a single LabeledSystem to an LMDB database.
Convert from this format to LabeledSystem#
- dpdata.LabeledSystem(file_name, fmt: Literal['lmdb'] = None) -> dpdata.system.LabeledSystem
- dpdata.LabeledSystem.from_lmdb(file_name) -> dpdata.system.LabeledSystem
Load data for a single LabeledSystem from an LMDB database.
- Returns:
- LabeledSystem
converted system
Convert from this format to MultiSystems#
- dpdata.MultiSystems.from_lmdb(file_name, map_size=1000000000) -> dpdata.system.MultiSystems
Load multiple systems from a single LMDB database.
- Parameters:
- file_name : str
The path to the LMDB database directory.
- map_size : int, optional
Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).
- Returns:
- MultiSystems
converted systems
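Because map_size is a byte count, it can help to derive it from the size of the dataset instead of guessing. The following is a rough sketch only: the helper name `estimate_map_size`, the assumed per-frame layout (float64 coordinates, forces, cell, virial, and energy), and the overhead factor are all assumptions made for illustration, not a description of how dpdata actually lays out its records.

```python
def estimate_map_size(n_frames: int, n_atoms: int, overhead: float = 2.0) -> int:
    """Rough upper bound, in bytes, for labeled frames stored as float64 values.

    Hypothetical helper: the per-frame layout and the overhead factor are
    assumptions, not dpdata's actual on-disk format.
    """
    bytes_per_float = 8
    floats_per_frame = (
        3 * n_atoms    # coordinates
        + 3 * n_atoms  # forces
        + 9            # cell
        + 9            # virial
        + 1            # energy
    )
    return int(n_frames * floats_per_frame * bytes_per_float * overhead)


# 100,000 frames of 200 atoms already exceed the 1000000000-byte default.
map_size = estimate_map_size(n_frames=100_000, n_atoms=200)
needs_larger_map = map_size > 1_000_000_000
```

The resulting value can then be passed as the map_size argument shown in the signatures on this page.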
Convert from MultiSystems to this format#
- dpdata.MultiSystems.to(fmt: Literal['lmdb'], directory, map_size=1000000000, frame_idx_fmt='012d') -> dpdata.system.MultiSystems
- dpdata.MultiSystems.to_lmdb(directory, map_size=1000000000, frame_idx_fmt='012d') -> dpdata.system.MultiSystems
Save multiple systems to a single LMDB database.
- Parameters:
- directory : str
The path to the LMDB database directory to write to.
- map_size : int, optional
Maximum size of the LMDB database in bytes. Default is 1000000000 (1 GB).
- frame_idx_fmt : str, optional
The format string used to encode the frame index as a key. Default is "012d".
- Returns:
- MultiSystems
this system
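The effect of the frame_idx_fmt format specification can be illustrated with plain Python. This sketch shows only what "012d" does to an integer index; how dpdata combines the result with any prefix to form the full key is not covered here and is not assumed. Zero-padding matters because LMDB orders keys lexicographically, so fixed-width keys keep frames in numeric order.

```python
frame_idx_fmt = "012d"  # default value from the signature above
indices = [0, 7, 42, 1000]

# Zero-padded, fixed-width keys: 42 becomes "000000000042".
padded_keys = [format(i, frame_idx_fmt) for i in indices]

# Lexicographic sorting of padded keys matches numeric order.
padded_sorted = sorted(padded_keys)

# Without padding, lexicographic order breaks numeric order.
unpadded_sorted = sorted(str(i) for i in indices)

print(padded_keys[2])    # 000000000042
print(unpadded_sorted)   # ['0', '1000', '42', '7']
```

A width of 12 digits leaves room for indices up to 999999999999 before the fixed-width property is lost.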