deepmd.cluster package
Module that reads node resources and automatically detects whether it is running locally or on SLURM.
- deepmd.cluster.get_resource() -> Tuple[str, List[str], Optional[List[int]]]
Get local or SLURM resources: nodename, nodelist, and gpus.
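The following is a minimal sketch of how such auto-detection could work, not the package's actual implementation. It assumes that a SLURM job is recognized by the presence of the `SLURM_JOB_NODELIST` environment variable and that GPUs are listed as integer indices in `CUDA_VISIBLE_DEVICES`. The names `is_slurm_job`, `parse_gpus`, and `sketch_local_resource` are hypothetical and exist only for this illustration.

```python
import socket
from typing import List, Mapping, Optional, Tuple

Resource = Tuple[str, List[str], Optional[List[int]]]


def is_slurm_job(env: Mapping[str, str]) -> bool:
    """Assumption: a SLURM job exports SLURM_JOB_NODELIST."""
    return "SLURM_JOB_NODELIST" in env


def parse_gpus(env: Mapping[str, str]) -> Optional[List[int]]:
    """Assumption: CUDA_VISIBLE_DEVICES holds comma-separated integer indices."""
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    tokens = [tok.strip() for tok in raw.split(",") if tok.strip()]
    return [int(tok) for tok in tokens] if tokens else None


def sketch_local_resource(env: Mapping[str, str]) -> Resource:
    """Hypothetical local fallback: a single node, the current host."""
    nodename = socket.gethostname()
    return nodename, [nodename], parse_gpus(env)
```

In this sketch the environment is passed in as a mapping (for example `os.environ`) so that the functions can be exercised without a real cluster. The returned tuple has the same shape as the documented one: nodename, nodelist, and gpus, where gpus is `None` when no devices are listed.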
Submodules
deepmd.cluster.local module
Get local GPU resources.
deepmd.cluster.slurm module
Module to get resources on a SLURM cluster.
References
https://github.com/deepsense-ai/tensorflow_on_slurm
- deepmd.cluster.slurm.get_resource() -> Tuple[str, List[str], Optional[List[int]]]
Get SLURM resources: nodename, nodelist, and gpus.
- Returns
Tuple[str, List[str], Optional[List[int]]]: nodename, nodelist, and gpus
- Raises
RuntimeError: if the number of nodes could not be retrieved
ValueError: if the list of nodes is not the same length as the number of nodes
ValueError: if the current nodename is not found in the node list
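The three error conditions above can be illustrated with a short sketch. This is not the module's source. It assumes the standard SLURM environment variables `SLURM_JOB_NODELIST`, `SLURM_JOB_NUM_NODES`, and `SLURMD_NODENAME`, and, as a simplification, that the node list is already a plain comma-separated list rather than a compressed hostlist such as `node[01-03]`. The function name `sketch_slurm_resource` is hypothetical.

```python
from typing import List, Mapping, Optional, Tuple


def sketch_slurm_resource(
    env: Mapping[str, str],
) -> Tuple[str, List[str], Optional[List[int]]]:
    """Illustrative only: mirrors the documented checks, not the real code."""
    # Simplifying assumption: the node list is already expanded.
    nodelist = [n for n in env.get("SLURM_JOB_NODELIST", "").split(",") if n]

    num_nodes = env.get("SLURM_JOB_NUM_NODES")
    if num_nodes is None:
        raise RuntimeError("Number of nodes could not be retrieved")

    if len(nodelist) != int(num_nodes):
        raise ValueError(
            f"Node list has {len(nodelist)} entries, expected {num_nodes}"
        )

    nodename = env.get("SLURMD_NODENAME")
    if nodename is None or nodename not in nodelist:
        raise ValueError(f"Node name {nodename!r} not found in node list")

    # Assumption: GPUs are given as integer indices in CUDA_VISIBLE_DEVICES.
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    tokens = [tok.strip() for tok in raw.split(",") if tok.strip()]
    gpus = [int(tok) for tok in tokens] if tokens else None

    return nodename, nodelist, gpus
```

Passing the environment as a mapping keeps the sketch testable off-cluster. On a real allocation, one would pass `os.environ` and expand the compressed node list first.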