Supported batch job systems#
A batch job system is a system that processes batch jobs. One needs to set batch_type to one of the following values:
Bash#
batch_type: Shell
When batch_type is set to Shell, dpdispatcher will generate a bash script to process jobs. No extra packages are required for Shell.
Because there is no scheduling system, Shell runs all jobs at the same time. To avoid running multiple jobs at once, one can set group_size to 0 (which means infinity) to generate only one job containing multiple tasks.
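The effect of group_size on the number of generated jobs can be illustrated with a short sketch. This is not dpdispatcher's own code: the helper `group_tasks` and the task names are hypothetical, and the only behavior taken from the text above is that a group_size of 0 puts every task into a single job.

```python
from typing import List, Sequence


def group_tasks(tasks: Sequence[str], group_size: int) -> List[List[str]]:
    """Split tasks into jobs holding at most group_size tasks each.

    Hypothetical helper for illustration only. A group_size of 0 is
    treated as "no limit", so all tasks end up in one job.
    """
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    if not tasks:
        return []
    if group_size == 0:
        return [list(tasks)]
    return [
        list(tasks[start:start + group_size])
        for start in range(0, len(tasks), group_size)
    ]


tasks = ["task_0", "task_1", "task_2", "task_3", "task_4"]

# With group_size=1 every task becomes its own job; with Shell these
# jobs would all start at the same time.
print(len(group_tasks(tasks, 1)))

# With group_size=0 there is a single job that contains every task.
print(len(group_tasks(tasks, 0)))
```

With five tasks, the first call yields five jobs and the second yields one, which is the situation the paragraph above recommends for Shell.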
Slurm#
batch_type: Slurm, SlurmJobArray
Slurm is a job scheduling system used by many HPC systems. One needs to make sure Slurm has been set up on the remote server and the related environment is activated.
When SlurmJobArray is used, dpdispatcher submits Slurm jobs with job arrays. In this way, several dpdispatcher tasks map to a Slurm job and a dpdispatcher job maps to a Slurm job array. Millions of Slurm jobs can be submitted quickly and Slurm can execute all Slurm jobs at the same time. One can use group_size and slurm_job_size to control how many Slurm jobs are contained in a Slurm job array.
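The relationship between group_size, slurm_job_size and the size of a job array can be sketched as simple arithmetic. The sketch rests on assumptions that the text above does not spell out: that group_size is the number of tasks in one dpdispatcher job, that slurm_job_size is the number of tasks handled by one Slurm job in the array, and that slurm_job_size defaults to 1. The function `slurm_array_size` is hypothetical and is not part of dpdispatcher.

```python
import math


def slurm_array_size(n_tasks_in_job: int, slurm_job_size: int = 1) -> int:
    """Estimate how many Slurm jobs one job array would contain.

    Assumption: each Slurm job in the array runs up to slurm_job_size
    tasks, so the array needs ceil(n_tasks / slurm_job_size) entries.
    """
    if n_tasks_in_job < 0:
        raise ValueError("n_tasks_in_job must be non-negative")
    if slurm_job_size < 1:
        raise ValueError("slurm_job_size must be at least 1")
    return math.ceil(n_tasks_in_job / slurm_job_size)


# A dpdispatcher job holding 100 tasks (group_size=100):
print(slurm_array_size(100))        # one task per Slurm job
print(slurm_array_size(100, 10))    # ten tasks per Slurm job
print(slurm_array_size(105, 10))    # the last Slurm job is only partly filled
```

Under these assumptions, raising slurm_job_size shrinks the array while raising group_size grows it.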
OpenPBS or PBSPro#
batch_type: PBS
OpenPBS is an open-source job scheduler of the Linux Foundation, and PBS Professional is its commercial counterpart. One needs to make sure OpenPBS has been set up on the remote server and the related environment is activated.
Note: do not use PBS for TORQUE.
TORQUE#
batch_type: Torque
The Terascale Open-source Resource and QUEue Manager (TORQUE) is a distributed resource manager based on standard OpenPBS. However, TORQUE no longer supports all OpenPBS flags. One needs to make sure TORQUE has been set up on the remote server and the related environment is activated.
LSF#
batch_type: LSF
IBM Spectrum LSF Suites is a comprehensive workload management solution used by HPC systems. One needs to make sure LSF has been set up on the remote server and the related environment is activated.
JH UniScheduler#
batch_type: JH_UniScheduler
JH UniScheduler was developed by JHINNO and uses “jsub” to submit tasks. Its overall architecture is similar to that of IBM’s LSF, although there are some differences between them. One needs to make sure JH UniScheduler has been set up on the remote server and the related environment is activated.
Bohrium#
batch_type: Bohrium
Bohrium is a cloud platform for scientific computing. Read the Bohrium documentation for details.
DistributedShell#
batch_type: DistributedShell
DistributedShell is used to submit YARN jobs. Read Support DPDispatcher on Yarn for details.
Fugaku#
batch_type: Fugaku
Fujitsu cloud service is a job scheduling system used by Fujitsu’s HPCs such as Fugaku, ITO and K computer. Note that although these systems use the same job scheduling system, they differ in some details, so the Fugaku class cannot be used directly for other HPCs.
Read the Fujitsu cloud service documentation for details.
OpenAPI#
batch_type: OpenAPI
OpenAPI is a new way to submit jobs to Bohrium. It uses an AccessKey instead of a username and password. Read the Bohrium documentation for details.
SGE#
batch_type: SGE
The Sun Grid Engine (SGE) scheduler is a batch-queueing system for distributed resource management. The commands and flags of SGE are very similar to those of PBS, except when checking job status. Use this value when submitting jobs to an SGE-based batch system.
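Since batch_type must be one of the values listed on this page, a misspelled key or value is an easy mistake to make. The following sketch collects the listed values and checks a candidate before writing it into a configuration fragment. The helper `make_batch_config` is hypothetical, the exact-match comparison is an assumption that may be stricter than dpdispatcher itself, and a real configuration needs further keys that are not shown here.

```python
import json

# batch_type values as listed on this page.
SUPPORTED_BATCH_TYPES = (
    "Shell",
    "Slurm",
    "SlurmJobArray",
    "PBS",
    "Torque",
    "LSF",
    "JH_UniScheduler",
    "Bohrium",
    "DistributedShell",
    "Fugaku",
    "OpenAPI",
    "SGE",
)


def make_batch_config(batch_type: str) -> dict:
    """Return a minimal config fragment after checking batch_type.

    Hypothetical helper: it only validates the value against the list
    above and does not call dpdispatcher.
    """
    if batch_type not in SUPPORTED_BATCH_TYPES:
        raise ValueError(
            f"unknown batch_type {batch_type!r}; "
            f"expected one of {', '.join(SUPPORTED_BATCH_TYPES)}"
        )
    return {"batch_type": batch_type}


# PBS and Torque are distinct values, so a TORQUE cluster uses "Torque".
print(json.dumps(make_batch_config("Torque"), indent=2))
```

Keeping the check separate from job submission means a typo is reported before anything is sent to a remote server.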