In this example we will convert DFT-labeled data stored in VASP OUTCAR format into the data format used by DeePMD-kit. The example OUTCAR can be found in the directory.



DeePMD-kit organizes data in systems. Each system is composed of a number of frames. One may roughly view a frame as a snapshot of an MD trajectory, but it does not necessarily come from an MD simulation. A frame records the coordinates and types of the atoms, the cell vectors if periodic boundary conditions are assumed, the energy, the atomic forces, and the virial. Note that all frames in one system share the same number of atoms with the same types.
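The relationship between a system and its frames can be pictured with a small Python sketch. The `Frame` and `System` classes below are hypothetical containers written only for illustration; they are not part of DeePMD-kit or dpdata, and the actual on-disk format differs. The sketch only demonstrates the constraint that every frame in a system has the same atoms.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Frame:
    """One labeled configuration (hypothetical container, for illustration only)."""
    coords: List[Vec3]                   # one position per atom
    energy: float                        # total energy of the configuration
    forces: List[Vec3]                   # one force vector per atom
    cell: Optional[List[Vec3]] = None    # three cell vectors, None if not periodic
    virial: Optional[List[Vec3]] = None  # 3x3 virial, None if not available


@dataclass
class System:
    """A set of frames sharing one list of atom types (hypothetical container)."""
    atom_types: List[int]                # type index of each atom, shared by all frames
    frames: List[Frame] = field(default_factory=list)

    def add_frame(self, frame: Frame) -> None:
        # Reject frames whose atom count differs from the system's atom list.
        natoms = len(self.atom_types)
        if len(frame.coords) != natoms or len(frame.forces) != natoms:
            raise ValueError("every frame must have the same atoms as the system")
        self.frames.append(frame)


# A two-atom system holding a single frame without periodic boundaries.
system = System(atom_types=[0, 1])
system.add_frame(
    Frame(
        coords=[(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
        energy=-1.5,
        forces=[(0.0, 0.0, 0.1), (0.0, 0.0, -0.1)],
    )
)
```

Configurations with a different number of atoms or different atom types would therefore have to go into a separate system.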

Data conversion

It is convenient to use dpdata to convert data generated by DFT packages to the data format used by DeePMD-kit.

To install dpdata, one can execute

pip install dpdata

An example of converting VASP data in OUTCAR format to DeePMD-kit data can be found at


Switch to that directory; the data can then be converted with the following Python script

import dpdata
dsys = dpdata.LabeledSystem('OUTCAR')  # read the labeled frames from the VASP OUTCAR
dsys.to('deepmd/npy', 'deepmd_data', set_size=dsys.get_nframes())  # write all frames as one set

The get_nframes() method returns the number of frames in the OUTCAR, and the set_size argument forces the set size to equal the number of frames in the system, i.e., only one set is created in the system.
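The effect of the set size can be illustrated with a short, self-contained sketch. It assumes that frames are grouped into consecutive sets of at most set_size frames each; the helper `split_into_sets` is hypothetical and is not dpdata's implementation.

```python
def split_into_sets(nframes: int, set_size: int) -> list:
    """Group frame indices 0..nframes-1 into consecutive sets.

    Assumption for illustration: each set holds at most `set_size` frames,
    and the last set holds whatever remains.
    """
    if set_size <= 0:
        raise ValueError("set_size must be positive")
    return [
        range(start, min(start + set_size, nframes))
        for start in range(0, nframes, set_size)
    ]


nframes = 10

# set_size equal to the number of frames: a single set with all frames.
one_set = split_into_sets(nframes, set_size=nframes)

# A smaller set_size: several sets, the last one possibly shorter.
several_sets = split_into_sets(nframes, set_size=4)

print(len(one_set), len(several_sets))
```

Under this assumption, any set_size greater than or equal to the number of frames also yields a single set.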

The data in DeePMD-kit format is stored in the folder deepmd_data.
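To check the result, one can list the contents of the output folder. The sketch below uses only the Python standard library and assumes the deepmd/npy layout in which each set is a subdirectory named set.000, set.001, and so on; the helper `list_sets` is hypothetical and is not part of dpdata.

```python
import os


def list_sets(system_dir: str) -> dict:
    """Map each `set.*` subdirectory of `system_dir` to its sorted file names.

    Assumption: sets are stored as subdirectories whose names start with 'set.'.
    """
    sets = {}
    for name in sorted(os.listdir(system_dir)):
        path = os.path.join(system_dir, name)
        if name.startswith("set.") and os.path.isdir(path):
            sets[name] = sorted(os.listdir(path))
    return sets


# Print the sets only if the conversion has already been run in this directory.
if os.path.isdir("deepmd_data"):
    for set_name, files in list_sets("deepmd_data").items():
        print(set_name, files)
```

With set_size equal to the number of frames, one would expect exactly one set directory to be listed.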

A list of all supported data formats and further features of dpdata can be found on the official website.