Supported batch job systems
A batch job system schedules and runs batch jobs. One needs to set batch_type to one of the following values:
Bash
batch_type: Shell
When batch_type is set to Shell, dpdispatcher generates a bash script to process jobs. No extra packages are required for Shell.
Because there is no scheduling system, Shell runs all jobs at the same time. To avoid running multiple jobs at once, one can set group_size to 0 (meaning infinity) so that only one job, containing all the tasks, is generated.
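The effect of group_size can be illustrated with a minimal sketch. This is not dpdispatcher's own code: `group_tasks` and the task names are hypothetical, and the sketch only assumes that group_size is the maximum number of tasks per job, with 0 meaning no limit.

```python
from typing import List, Sequence


def group_tasks(tasks: Sequence[str], group_size: int) -> List[List[str]]:
    """Split tasks into jobs of at most group_size tasks (hypothetical helper).

    A group_size of 0 is treated as "infinity": all tasks go into one job.
    """
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    tasks = list(tasks)
    if group_size == 0:
        return [tasks] if tasks else []
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


tasks = ["task_0", "task_1", "task_2", "task_3", "task_4"]

# With group_size=1 every task becomes its own job, so Shell would start
# all five jobs at once.
jobs_separate = group_tasks(tasks, 1)

# With group_size=0 there is a single job that contains all five tasks.
jobs_single = group_tasks(tasks, 0)

print(len(jobs_separate), len(jobs_single))
```

With five tasks, the first call yields five jobs and the second yields one.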
Slurm
batch_type: Slurm, SlurmJobArray
Slurm is a job scheduling system used by many HPC clusters. One needs to make sure Slurm has been set up on the remote server and the related environment is activated.
When SlurmJobArray is used, dpdispatcher submits Slurm jobs with job arrays. In this way, several dpdispatcher tasks map to a Slurm job and a dpdispatcher job maps to a Slurm job array. Millions of Slurm jobs can be submitted quickly, and Slurm can execute all of them at the same time. One can use group_size and slurm_job_size to control how many Slurm jobs are contained in a Slurm job array.
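As a rough sketch of how the two settings interact, the arithmetic below assumes that group_size is the number of tasks in one dpdispatcher job (one job array) and that slurm_job_size is the number of tasks packed into one Slurm job. The helper `slurm_jobs_per_array` is hypothetical, and placing slurm_job_size under a `kwargs` key of the resources dictionary is also an assumption to be checked against the dpdispatcher resources documentation.

```python
import math

# Hypothetical resources fragment; the "kwargs" nesting is an assumption.
resources = {
    "group_size": 8,
    "kwargs": {"slurm_job_size": 2},
}


def slurm_jobs_per_array(group_size: int, slurm_job_size: int) -> int:
    """Estimate how many Slurm jobs one job array holds (illustrative only)."""
    if group_size <= 0 or slurm_job_size <= 0:
        raise ValueError("both sizes must be positive in this sketch")
    # Each Slurm job takes up to slurm_job_size tasks; the last may be smaller.
    return math.ceil(group_size / slurm_job_size)


n_jobs = slurm_jobs_per_array(
    resources["group_size"], resources["kwargs"]["slurm_job_size"]
)
print(n_jobs)
```

Under these assumptions, eight tasks per job array with two tasks per Slurm job give an array of four Slurm jobs.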
OpenPBS or PBSPro
batch_type: PBS
OpenPBS is an open-source job scheduler of the Linux Foundation, and PBS Professional is its commercial counterpart. One needs to make sure OpenPBS has been set up on the remote server and the related environment is activated.
Note: do not use PBS for TORQUE; use Torque instead.
TORQUE
batch_type: Torque
The Terascale Open-source Resource and QUEue Manager (TORQUE) is a distributed resource manager based on standard OpenPBS. However, not all OpenPBS flags are supported in TORQUE. One needs to make sure TORQUE has been set up on the remote server and the related environment is activated.
LSF
batch_type: LSF
IBM Spectrum LSF Suites is a comprehensive workload management solution used by HPC clusters. One needs to make sure LSF has been set up on the remote server and the related environment is activated.
Bohrium
batch_type: Bohrium
Bohrium is a cloud platform for scientific computing. Read the Bohrium documentation for details.
DistributedShell
batch_type: DistributedShell
DistributedShell is used to submit YARN jobs. Read Support DPDispatcher on Yarn for details.
Fugaku
batch_type: Fugaku
Fujitsu cloud service is a job scheduling system used by Fujitsu's HPCs such as Fugaku, ITO and K computer. Note that although these machines use the same job scheduling system, they differ in some details, so the Fugaku class cannot be used directly for other HPCs.
Read the Fujitsu cloud service documentation for details.
OpenAPI
batch_type: OpenAPI
OpenAPI is a new way to submit jobs to Bohrium. It uses an AccessKey instead of a username and password. Read the Bohrium documentation for details.
SGE
batch_type: SGE
The Sun Grid Engine (SGE) scheduler is a batch-queueing system for distributed resource management. The commands and flags of SGE are very similar to those of PBS, except for checking job status. Use this value when submitting jobs to an SGE-based batch system.
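Since batch_type must be one of the values listed on this page, a small pre-submission check can catch typos early. The sketch below is illustrative only: `check_batch_type` is a hypothetical helper, not part of dpdispatcher, and it assumes exact, case-sensitive matching, which may be stricter than what dpdispatcher itself accepts.

```python
# Values of batch_type taken from the sections of this page.
SUPPORTED_BATCH_TYPES = (
    "Shell",
    "Slurm",
    "SlurmJobArray",
    "PBS",
    "Torque",
    "LSF",
    "Bohrium",
    "DistributedShell",
    "Fugaku",
    "OpenAPI",
    "SGE",
)


def check_batch_type(machine: dict) -> str:
    """Return machine["batch_type"] if it is listed above (hypothetical helper)."""
    batch_type = machine.get("batch_type")
    if batch_type not in SUPPORTED_BATCH_TYPES:
        raise ValueError(
            f"unknown batch_type {batch_type!r}; "
            f"expected one of {', '.join(SUPPORTED_BATCH_TYPES)}"
        )
    return batch_type


# A machine dictionary reduced to the one key discussed on this page.
machine = {"batch_type": "SGE"}
print(check_batch_type(machine))
```

A misspelled key value such as "SEG" would raise a ValueError listing the accepted names.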