12. Runtime environment variables

Note

For build-time environment variables, see Install from source code.

12.1. All interfaces

DP_INTER_OP_PARALLELISM_THREADS

Alias: TF_INTER_OP_PARALLELISM_THREADS; Default: 0

Control inter-operator parallelism (the number of threads used to run independent operators concurrently) for TensorFlow (when TensorFlow is built against Eigen) and PyTorch native OPs on CPU devices. See How to control the parallelism of a job for details.

DP_INTRA_OP_PARALLELISM_THREADS

Alias: TF_INTRA_OP_PARALLELISM_THREADS; Default: 0

Control intra-operator parallelism (the number of threads used within a single operator) for TensorFlow (when TensorFlow is built against Eigen) and PyTorch native OPs. See How to control the parallelism of a job for details.
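As a minimal sketch, both variables can be exported in the shell before a job is launched. The thread counts below (4 and 2) are illustrative assumptions, not recommended values.

```shell
# Illustrative values only; tune them for your hardware.
# Threads used inside a single operator:
export DP_INTRA_OP_PARALLELISM_THREADS=4
# Threads used to run independent operators concurrently:
export DP_INTER_OP_PARALLELISM_THREADS=2

# Print the settings that a subsequently launched job would inherit.
echo "intra-op threads: ${DP_INTRA_OP_PARALLELISM_THREADS}"
echo "inter-op threads: ${DP_INTER_OP_PARALLELISM_THREADS}"
```

Any program started from the same shell afterwards inherits these values.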

12.2. Environment variables of dependencies

If OpenMP is used, OpenMP environment variables can be used to control OpenMP threads, such as OMP_NUM_THREADS.
If CUDA is used, CUDA environment variables can be used to control CUDA devices, such as CUDA_VISIBLE_DEVICES.
If ROCm is used, ROCm environment variables can be used to control ROCm devices.
If TensorFlow is used, TensorFlow environment variables can be used.
If PyTorch is used, PyTorch environment variables can be used.
If JAX is used, JAX environment variables can be used; JAX_PLATFORMS and XLA_FLAGS are commonly used.
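The following sketch shows how a few of the dependency variables named above might be set. The values are illustrative assumptions, and each line is independent of the others; set only the ones relevant to the libraries actually in use.

```shell
# Illustrative values only.
export OMP_NUM_THREADS=4        # number of OpenMP threads
export CUDA_VISIBLE_DEVICES=0   # expose only the first CUDA device
export JAX_PLATFORMS=cpu        # ask JAX to run on the CPU platform

echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"
echo "JAX_PLATFORMS=${JAX_PLATFORMS}"
```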

12.3. Python interface only

DP_INTERFACE_PREC

Choices: high, low; Default: high

Control whether training uses high (double) or low (float) precision.

DP_AUTO_PARALLELIZATION

Choices: 0, 1; Default: 0

Enable auto parallelization for CPU operators.

DP_JIT

Choices: 0, 1; Default: 0

Enable JIT. Note that this option may either improve or degrade performance. Requires a TensorFlow build that supports JIT.

DP_INFER_BATCH_SIZE

Default: 1024 on CPUs; on GPUs, as large as possible until an out-of-memory error occurs

Inference batch size, calculated by multiplying the number of frames by the number of atoms.
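To make the unit concrete, the arithmetic can be sketched as follows. This assumes that the number of frames per batch is the batch size divided by the number of atoms, rounded down, with at least one frame per batch; the 192-atom system is hypothetical, and the variable names natoms and nframes are used only for illustration.

```shell
# Hypothetical system size; the batch size mirrors the documented CPU default.
DP_INFER_BATCH_SIZE=1024
natoms=192

# Batch size = frames x atoms, so frames per batch = batch size / atoms.
nframes=$(( DP_INFER_BATCH_SIZE / natoms ))
# Assumption: always process at least one frame per batch.
if [ "$nframes" -lt 1 ]; then nframes=1; fi

echo "frames per batch: ${nframes}"   # prints "frames per batch: 5"
```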

DP_BACKEND

Default: tensorflow

Backend used by default when none is specified explicitly.

NUM_WORKERS

Default: 4 or the number of cores (whichever is smaller)

Number of subprocesses to use for data loading in the PyTorch backend. See PyTorch documentation for details.
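The documented default rule can be reproduced in the shell as a rough sketch. This mirrors the rule stated above and is not the package's actual implementation; the variable names ncores, default_workers, and workers are illustrative, and getconf _NPROCESSORS_ONLN is assumed to be available (it is on Linux and macOS).

```shell
# Number of online CPU cores; fall back to 1 if getconf cannot report it.
ncores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Documented default: 4 or the number of cores, whichever is smaller.
if [ "$ncores" -lt 4 ]; then
    default_workers=$ncores
else
    default_workers=4
fi

# Use the user's value when NUM_WORKERS is set, the default otherwise.
workers=${NUM_WORKERS:-$default_workers}
echo "data-loading workers: ${workers}"
```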

12.4. C++ interface only

These environment variables also apply to third-party programs using the C++ interface, such as LAMMPS.

DP_PLUGIN_PATH

Type: List of paths, split by : on Unix and ; on Windows

List of customized OP plugin libraries to load, such as /path/to/plugin1.so:/path/to/plugin2.so on Linux and /path/to/plugin1.dll;/path/to/plugin2.dll on Windows.
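As a sketch of the Unix format, the snippet below sets the variable using the placeholder paths from the example above and splits it on ":" to show how the value decomposes into individual entries. The splitting is done here only for illustration; it is not a description of how the library parses the value.

```shell
# Placeholder plugin locations, joined with ':' as on Unix.
export DP_PLUGIN_PATH="/path/to/plugin1.so:/path/to/plugin2.so"

# Split the value on ':' into positional parameters (POSIX shell).
old_ifs=$IFS
IFS=:
set -- $DP_PLUGIN_PATH
IFS=$old_ifs

nplugins=$#
for plugin in "$@"; do
    echo "plugin: ${plugin}"
done
```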

DP_PROFILER

Enable the built-in PyTorch Kineto profiler for the PyTorch C++ (inference) backend.

Type: string (output file stem)

Default: unset (disabled)

When set to a non-empty value, profiling is enabled for the lifetime of the loaded PyTorch model (e.g. during LAMMPS runs). A JSON trace file is written when the run finishes. The final file name is constructed as:

<ENV_VALUE>_gpu<ID>.json if running on GPU
<ENV_VALUE>.json if running on CPU
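The naming rule above can be sketched as a small shell helper. The function name profiler_trace_name is hypothetical and exists only for this illustration; it simply restates the two cases and is not part of the package.

```shell
# Build the trace file name from the documented rule.
# $1: value of DP_PROFILER; $2: GPU id, or omitted when running on CPU.
profiler_trace_name() {
    if [ -n "${2:-}" ]; then
        echo "${1}_gpu${2}.json"
    else
        echo "${1}.json"
    fi
}

profiler_trace_name result 0   # GPU run with id 0: prints result_gpu0.json
profiler_trace_name result     # CPU run: prints result.json
```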

The trace can be examined with Chrome trace viewer (alternatively chrome://tracing). It includes:

CPU operator activities
CUDA activities (if available)

Example:

export DP_PROFILER=result
mpirun -np 4 lmp -in in.lammps
# Produces result_gpuX.json, where X is the GPU id used by each MPI rank.

Tips:

Large runs can generate sizable JSON files; consider limiting the number of MD steps, e.g. to 20.
Currently this feature supports only single-process runs, or multi-process runs in which each process uses a distinct GPU on the same node.