dpgen run machine parameters#
Note
One can load, modify, and export the input file by using our effective web-based tool DP-GUI online or hosted using the command line interface dpgen gui. All parameters below can be set in DP-GUI. By clicking “SAVE JSON”, one can download the input file.
- run_mdata:#
- type:
dict
argument path:run_mdata
machine.json file
- api_version:#
- type:
str
, optional, default:1.0
argument path:run_mdata/api_version
Please set to 1.0
- deepmd_version:#
- type:
str
, optional, default:2
argument path:run_mdata/deepmd_version
DeePMD-kit version, e.g. 2.1.3
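For orientation, a minimal machine.json skeleton assembled from the keys documented below might look like the following sketch; the command strings (dp, lmp, mpirun -n 32 vasp_std) and all other values are illustrative placeholders, not required values.

```json
{
    "api_version": "1.0",
    "deepmd_version": "2.1.3",
    "train": {
        "command": "dp",
        "machine": {"...": "see run_mdata/train/machine"},
        "resources": {"...": "see run_mdata/train/resources"}
    },
    "model_devi": {
        "command": "lmp",
        "machine": {"...": "..."},
        "resources": {"...": "..."}
    },
    "fp": {
        "command": "mpirun -n 32 vasp_std",
        "machine": {"...": "..."},
        "resources": {"...": "..."}
    }
}
```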
- train:#
- type:
dict
argument path:run_mdata/train
Parameters of command, machine, and resources for train
- command:#
- type:
str
argument path:run_mdata/train/command
Command of a program.
- machine:#
- type:
dict
argument path:run_mdata/train/machine
- batch_type:#
- type:
str
argument path:run_mdata/train/machine/batch_type
The batch job system type. Option: Fugaku, LSF, SlurmJobArray, JH_UniScheduler, Shell, Torque, PBS, SGE, Slurm, DistributedShell, OpenAPI, Bohrium
- local_root:#
- type:
str
|NoneType
argument path:run_mdata/train/machine/local_root
The directory where the tasks and related files are located. Typically the project directory.
- remote_root:#
- type:
str
|NoneType
, optional
argument path:run_mdata/train/machine/remote_root
The dir where the tasks are executed on the remote machine. Only needed when context is not lazy-local.
- clean_asynchronously:#
- type:
bool
, optional, default:False
argument path:run_mdata/train/machine/clean_asynchronously
Clean the remote directory asynchronously after the job finishes.
Depending on the value of context_type, different sub args are accepted.
- context_type:#
- type:
str
(flag key)
argument path:run_mdata/train/machine/context_type
possible choices: LocalContext, BohriumContext, HDFSContext, LazyLocalContext, OpenAPIContext, SSHContext
The connection used to the remote machine. Option: LocalContext, SSHContext, LazyLocalContext, BohriumContext, OpenAPIContext, HDFSContext
When context_type is set to LocalContext (or its aliases localcontext, Local, local):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/train/machine[LocalContext]/remote_profile
The information used to maintain the local machine.
- symlink:#
- type:
bool
, optional, default:True
argument path:run_mdata/train/machine[LocalContext]/remote_profile/symlink
Whether to use symbolic links to replace copy. This option should be turned off if the local directory is not accessible on the Batch system.
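As a sketch only, a machine dict that keeps everything on the local workstation with LocalContext could look like this; the Shell batch type and both paths are placeholders to adapt.

```json
{
    "batch_type": "Shell",
    "context_type": "LocalContext",
    "local_root": "./",
    "remote_root": "/tmp/dpgen_workdir",
    "remote_profile": {
        "symlink": true
    }
}
```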
When context_type is set to BohriumContext (or its aliases bohriumcontext, Bohrium, bohrium, DpCloudServerContext, dpcloudservercontext, DpCloudServer, dpcloudserver, LebesgueContext, lebesguecontext, Lebesgue, lebesgue):
- remote_profile:#
- type:
dict
argument path:run_mdata/train/machine[BohriumContext]/remote_profile
The information used to maintain the connection with remote machine.
- email:#
- type:
str
, optional
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/email
Email
- password:#
- type:
str
, optional
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/password
Password
- program_id:#
- type:
int
, alias: project_id
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/program_id
Program ID
- retry_count:#
- type:
NoneType
|int
, optional, default:2
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/retry_count
The retry count when a job is terminated
- ignore_exit_code:#
- type:
bool
, optional, default:True
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/ignore_exit_code
The job state will be marked as finished if the exit code is non-zero when set to True. Otherwise, the job state will be designated as terminated.
- keep_backup:#
- type:
bool
, optional
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/keep_backup
keep download and upload zip
- input_data:#
- type:
dict
argument path:run_mdata/train/machine[BohriumContext]/remote_profile/input_data
Configuration of job
When context_type is set to HDFSContext (or its aliases hdfscontext, HDFS, hdfs):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/train/machine[HDFSContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to LazyLocalContext (or its aliases lazylocalcontext, LazyLocal, lazylocal):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/train/machine[LazyLocalContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to OpenAPIContext (or its aliases openapicontext, OpenAPI, openapi):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/train/machine[OpenAPIContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to SSHContext (or its aliases sshcontext, SSH, ssh):
- remote_profile:#
- type:
dict
argument path:run_mdata/train/machine[SSHContext]/remote_profile
The information used to maintain the connection with remote machine.
- hostname:#
- type:
str
argument path:run_mdata/train/machine[SSHContext]/remote_profile/hostname
hostname or ip of ssh connection.
- username:#
- type:
str
argument path:run_mdata/train/machine[SSHContext]/remote_profile/username
username of target linux system
- password:#
- type:
str
, optional
argument path:run_mdata/train/machine[SSHContext]/remote_profile/password
(deprecated) password of linux system. Please use SSH keys instead to improve security.
- port:#
- type:
int
, optional, default:22
argument path:run_mdata/train/machine[SSHContext]/remote_profile/port
ssh connection port.
- key_filename:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/machine[SSHContext]/remote_profile/key_filename
key filename used by ssh connection. If left None, find key in ~/.ssh or use password for login
- passphrase:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/machine[SSHContext]/remote_profile/passphrase
passphrase of key used by ssh connection
- timeout:#
- type:
int
, optional, default:10
argument path:run_mdata/train/machine[SSHContext]/remote_profile/timeout
timeout of ssh connection
- totp_secret:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/machine[SSHContext]/remote_profile/totp_secret
Time-based one time password secret. It should be a base32-encoded string extracted from the 2D code.
- tar_compress:#
- type:
bool
, optional, default:True
argument path:run_mdata/train/machine[SSHContext]/remote_profile/tar_compress
The archive will be compressed in upload and download if it is True. If not, compression will be skipped.
- look_for_keys:#
- type:
bool
, optional, default:True
argument path:run_mdata/train/machine[SSHContext]/remote_profile/look_for_keys
enable searching for discoverable private key files in ~/.ssh/
- execute_command:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/machine[SSHContext]/remote_profile/execute_command
execute command after ssh connection is established.
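Putting the SSHContext fields together, a hedged example of a machine dict that submits to a remote Slurm cluster over SSH might read as follows; the hostname, username, paths, and key file are placeholders.

```json
{
    "batch_type": "Slurm",
    "context_type": "SSHContext",
    "local_root": "./",
    "remote_root": "/home/user/dpgen_workdir",
    "remote_profile": {
        "hostname": "login.hpc.example.org",
        "username": "user",
        "port": 22,
        "key_filename": "/home/user/.ssh/id_rsa",
        "timeout": 10
    }
}
```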
- resources:#
- type:
dict
argument path:run_mdata/train/resources
- number_node:#
- type:
int
, optional, default:1
argument path:run_mdata/train/resources/number_node
The number of nodes required for each job.
- cpu_per_node:#
- type:
int
, optional, default:1
argument path:run_mdata/train/resources/cpu_per_node
CPU numbers of each node assigned to each job.
- gpu_per_node:#
- type:
int
, optional, default:0
argument path:run_mdata/train/resources/gpu_per_node
GPU numbers of each node assigned to each job.
- queue_name:#
- type:
str
, optional, default: (empty string)
argument path:run_mdata/train/resources/queue_name
The queue name of batch job scheduler system.
- group_size:#
- type:
int
argument path:run_mdata/train/resources/group_size
The number of tasks in a job. 0 means infinity.
- custom_flags:#
- type:
typing.List[str]
, optional
argument path:run_mdata/train/resources/custom_flags
The extra lines passed to the job submitting script header.
- strategy:#
- type:
dict
, optional
argument path:run_mdata/train/resources/strategy
Strategies used to generate job submitting scripts.
- if_cuda_multi_devices:#
- type:
bool
, optional, default:False
argument path:run_mdata/train/resources/strategy/if_cuda_multi_devices
If there are multiple NVIDIA GPUs on the node and we want to assign tasks to different GPUs. If true, dpdispatcher will manually export the environment variable CUDA_VISIBLE_DEVICES to a different value for each task. Usually, this option is used together with the Task.task_need_resources variable.
- ratio_unfinished:#
- type:
float
, optional, default:0.0
argument path:run_mdata/train/resources/strategy/ratio_unfinished
The ratio of tasks that can be unfinished.
- customized_script_header_template_file:#
- type:
str
, optional
argument path:run_mdata/train/resources/strategy/customized_script_header_template_file
The customized template file to generate job submitting script header, which overrides the default file.
- para_deg:#
- type:
int
, optional, default:1
argument path:run_mdata/train/resources/para_deg
Decide how many tasks will be run in parallel.
- source_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/train/resources/source_list
The env file to be sourced before the command execution.
- module_purge:#
- type:
bool
, optional, default:False
argument path:run_mdata/train/resources/module_purge
Remove all modules on HPC system before module load (module_list)
- module_unload_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/train/resources/module_unload_list
The modules to be unloaded on HPC system before submitting jobs
- module_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/train/resources/module_list
The modules to be loaded on HPC system before submitting jobs
- envs:#
- type:
dict
, optional, default:{}
argument path:run_mdata/train/resources/envs
The environment variables to be exported before submitting jobs
- prepend_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/train/resources/prepend_script
Optional script run before jobs submitted.
- append_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/train/resources/append_script
Optional script run after jobs submitted.
- wait_time:#
- type:
float
|int
, optional, default:0
argument path:run_mdata/train/resources/wait_time
The waiting time in seconds after a single task is submitted.
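As an illustration only, a resources dict combining the general fields above could be written as follows; the queue name, module and source files, and the Slurm-style custom flag are placeholders for your own cluster.

```json
{
    "number_node": 1,
    "cpu_per_node": 4,
    "gpu_per_node": 1,
    "queue_name": "gpu",
    "group_size": 1,
    "module_list": ["cuda/11.6"],
    "source_list": ["/opt/deepmd-kit/env.sh"],
    "envs": {"OMP_NUM_THREADS": "1"},
    "custom_flags": ["#SBATCH --mem=32G"]
}
```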
Depending on the value of batch_type, different sub args are accepted.
- batch_type:#
- type:
str
(flag key)
argument path:run_mdata/train/resources/batch_type
possible choices: Bohrium, Torque, Shell, OpenAPI, JH_UniScheduler, Slurm, SGE, DistributedShell, LSF, PBS, SlurmJobArray, Fugaku
The batch job system type loaded from machine/batch_type.
When batch_type is set to Bohrium (or its aliases bohrium, Lebesgue, lebesgue, DpCloudServer, dpcloudserver):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[Bohrium]/kwargs
This field is empty for this batch.
When batch_type is set to Torque (or its alias torque):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[Torque]/kwargs
This field is empty for this batch.
When batch_type is set to Shell (or its alias shell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[Shell]/kwargs
This field is empty for this batch.
When batch_type is set to OpenAPI (or its alias openapi):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[OpenAPI]/kwargs
This field is empty for this batch.
When batch_type is set to JH_UniScheduler (or its alias jh_unischeduler):
- kwargs:#
- type:
dict
argument path:run_mdata/train/resources[JH_UniScheduler]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/resources[JH_UniScheduler]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #JSUB
When batch_type is set to Slurm (or its alias slurm):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[Slurm]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/resources[Slurm]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
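For example, on a Slurm cluster the generated GPU request could be overridden through kwargs as sketched below; the --gres value is only an illustration and depends on how the cluster names its GPUs.

```json
{
    "kwargs": {
        "custom_gpu_line": "#SBATCH --gres=gpu:v100:1"
    }
}
```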
When batch_type is set to SGE (or its alias sge):
- kwargs:#
- type:
dict
argument path:run_mdata/train/resources[SGE]/kwargs
Extra arguments.
- pe_name:#
- type:
str
, optional, default:mpi
, alias: sge_pe_name
argument path:run_mdata/train/resources[SGE]/kwargs/pe_name
The parallel environment name of SGE system.
- job_name:#
- type:
str
, optional, default:wDPjob
argument path:run_mdata/train/resources[SGE]/kwargs/job_name
The name of SGE’s job.
When batch_type is set to DistributedShell (or its alias distributedshell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[DistributedShell]/kwargs
This field is empty for this batch.
When batch_type is set to LSF (or its alias lsf):
- kwargs:#
- type:
dict
argument path:run_mdata/train/resources[LSF]/kwargs
Extra arguments.
- gpu_usage:#
- type:
bool
, optional, default:False
argument path:run_mdata/train/resources[LSF]/kwargs/gpu_usage
Choosing if GPU is used in the calculation step.
- gpu_new_syntax:#
- type:
bool
, optional, default:False
argument path:run_mdata/train/resources[LSF]/kwargs/gpu_new_syntax
For LSF >= 10.1.0.3, the new #BSUB option -gpu can be used. If False, the old syntax will be used.
- gpu_exclusive:#
- type:
bool
, optional, default:True
argument path:run_mdata/train/resources[LSF]/kwargs/gpu_exclusive
Only takes effect when the new syntax is enabled. Controls whether tasks are submitted with exclusive GPU usage.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/resources[LSF]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #BSUB
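A hedged LSF example: requesting one GPU per job with the new -gpu syntax could look like the snippet below; a custom_gpu_line, if given, supplies the #BSUB GPU directive directly, and the exact mode string depends on the cluster.

```json
{
    "kwargs": {
        "gpu_usage": true,
        "gpu_new_syntax": true,
        "gpu_exclusive": true,
        "custom_gpu_line": "#BSUB -gpu \"num=1:mode=exclusive_process\""
    }
}
```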
When batch_type is set to PBS (or its alias pbs):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[PBS]/kwargs
This field is empty for this batch.
When batch_type is set to SlurmJobArray (or its alias slurmjobarray):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[SlurmJobArray]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/train/resources[SlurmJobArray]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
- slurm_job_size:#
- type:
int
, optional, default:1
argument path:run_mdata/train/resources[SlurmJobArray]/kwargs/slurm_job_size
Number of tasks in a Slurm job
When batch_type is set to Fugaku (or its alias fugaku):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/train/resources[Fugaku]/kwargs
This field is empty for this batch.
- user_forward_files:#
- type:
list
, optional
argument path:run_mdata/train/user_forward_files
Files to be forwarded to the remote machine.
- user_backward_files:#
- type:
list
, optional
argument path:run_mdata/train/user_backward_files
Files to be transferred back from the remote machine.
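Assembling the pieces, a complete train block for a remote Slurm cluster reached over SSH could look roughly like this sketch; every concrete value is a placeholder.

```json
{
    "train": {
        "command": "dp",
        "machine": {
            "batch_type": "Slurm",
            "context_type": "SSHContext",
            "local_root": "./",
            "remote_root": "/home/user/dpgen_workdir",
            "remote_profile": {
                "hostname": "login.hpc.example.org",
                "username": "user",
                "key_filename": "/home/user/.ssh/id_rsa"
            }
        },
        "resources": {
            "number_node": 1,
            "cpu_per_node": 4,
            "gpu_per_node": 1,
            "queue_name": "gpu",
            "group_size": 1,
            "module_list": ["cuda/11.6"],
            "source_list": ["/opt/deepmd-kit/env.sh"]
        },
        "user_forward_files": [],
        "user_backward_files": []
    }
}
```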
- model_devi:#
- type:
dict
argument path:run_mdata/model_devi
Parameters of command, machine, and resources for model_devi
- command:#
- type:
str
argument path:run_mdata/model_devi/command
Command of a program.
- machine:#
- type:
dict
argument path:run_mdata/model_devi/machine
- batch_type:#
- type:
str
argument path:run_mdata/model_devi/machine/batch_type
The batch job system type. Option: Fugaku, LSF, SlurmJobArray, JH_UniScheduler, Shell, Torque, PBS, SGE, Slurm, DistributedShell, OpenAPI, Bohrium
- local_root:#
- type:
str
|NoneType
argument path:run_mdata/model_devi/machine/local_root
The directory where the tasks and related files are located. Typically the project directory.
- remote_root:#
- type:
str
|NoneType
, optional
argument path:run_mdata/model_devi/machine/remote_root
The dir where the tasks are executed on the remote machine. Only needed when context is not lazy-local.
- clean_asynchronously:#
- type:
bool
, optional, default:False
argument path:run_mdata/model_devi/machine/clean_asynchronously
Clean the remote directory asynchronously after the job finishes.
Depending on the value of context_type, different sub args are accepted.
- context_type:#
- type:
str
(flag key)
argument path:run_mdata/model_devi/machine/context_type
possible choices: LocalContext, BohriumContext, HDFSContext, LazyLocalContext, OpenAPIContext, SSHContext
The connection used to the remote machine. Option: LocalContext, SSHContext, LazyLocalContext, BohriumContext, OpenAPIContext, HDFSContext
When context_type is set to LocalContext (or its aliases localcontext, Local, local):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/model_devi/machine[LocalContext]/remote_profile
The information used to maintain the local machine.
- symlink:#
- type:
bool
, optional, default:True
argument path:run_mdata/model_devi/machine[LocalContext]/remote_profile/symlink
Whether to use symbolic links to replace copy. This option should be turned off if the local directory is not accessible on the Batch system.
When context_type is set to BohriumContext (or its aliases bohriumcontext, Bohrium, bohrium, DpCloudServerContext, dpcloudservercontext, DpCloudServer, dpcloudserver, LebesgueContext, lebesguecontext, Lebesgue, lebesgue):
- remote_profile:#
- type:
dict
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile
The information used to maintain the connection with remote machine.
- email:#
- type:
str
, optional
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/email
Email
- password:#
- type:
str
, optional
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/password
Password
- program_id:#
- type:
int
, alias: project_id
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/program_id
Program ID
- retry_count:#
- type:
NoneType
|int
, optional, default:2
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/retry_count
The retry count when a job is terminated
- ignore_exit_code:#
- type:
bool
, optional, default:True
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/ignore_exit_code
The job state will be marked as finished if the exit code is non-zero when set to True. Otherwise, the job state will be designated as terminated.
- keep_backup:#
- type:
bool
, optional
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/keep_backup
keep download and upload zip
- input_data:#
- type:
dict
argument path:run_mdata/model_devi/machine[BohriumContext]/remote_profile/input_data
Configuration of job
When context_type is set to HDFSContext (or its aliases hdfscontext, HDFS, hdfs):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/model_devi/machine[HDFSContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to LazyLocalContext (or its aliases lazylocalcontext, LazyLocal, lazylocal):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/model_devi/machine[LazyLocalContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to OpenAPIContext (or its aliases openapicontext, OpenAPI, openapi):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/model_devi/machine[OpenAPIContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to SSHContext (or its aliases sshcontext, SSH, ssh):
- remote_profile:#
- type:
dict
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile
The information used to maintain the connection with remote machine.
- hostname:#
- type:
str
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/hostname
hostname or ip of ssh connection.
- username:#
- type:
str
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/username
username of target linux system
- password:#
- type:
str
, optional
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/password
(deprecated) password of linux system. Please use SSH keys instead to improve security.
- port:#
- type:
int
, optional, default:22
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/port
ssh connection port.
- key_filename:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/key_filename
key filename used by ssh connection. If left None, find key in ~/.ssh or use password for login
- passphrase:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/passphrase
passphrase of key used by ssh connection
- timeout:#
- type:
int
, optional, default:10
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/timeout
timeout of ssh connection
- totp_secret:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/totp_secret
Time-based one time password secret. It should be a base32-encoded string extracted from the 2D code.
- tar_compress:#
- type:
bool
, optional, default:True
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/tar_compress
The archive will be compressed in upload and download if it is True. If not, compression will be skipped.
- look_for_keys:#
- type:
bool
, optional, default:True
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/look_for_keys
enable searching for discoverable private key files in ~/.ssh/
- execute_command:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/machine[SSHContext]/remote_profile/execute_command
execute command after ssh connection is established.
- resources:#
- type:
dict
argument path:run_mdata/model_devi/resources
- number_node:#
- type:
int
, optional, default:1
argument path:run_mdata/model_devi/resources/number_node
The number of nodes required for each job.
- cpu_per_node:#
- type:
int
, optional, default:1
argument path:run_mdata/model_devi/resources/cpu_per_node
CPU numbers of each node assigned to each job.
- gpu_per_node:#
- type:
int
, optional, default:0
argument path:run_mdata/model_devi/resources/gpu_per_node
GPU numbers of each node assigned to each job.
- queue_name:#
- type:
str
, optional, default: (empty string)
argument path:run_mdata/model_devi/resources/queue_name
The queue name of batch job scheduler system.
- group_size:#
- type:
int
argument path:run_mdata/model_devi/resources/group_size
The number of tasks in a job. 0 means infinity.
- custom_flags:#
- type:
typing.List[str]
, optional
argument path:run_mdata/model_devi/resources/custom_flags
The extra lines passed to the job submitting script header.
- strategy:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources/strategy
Strategies used to generate job submitting scripts.
- if_cuda_multi_devices:#
- type:
bool
, optional, default:False
argument path:run_mdata/model_devi/resources/strategy/if_cuda_multi_devices
If there are multiple NVIDIA GPUs on the node and we want to assign tasks to different GPUs. If true, dpdispatcher will manually export the environment variable CUDA_VISIBLE_DEVICES to a different value for each task. Usually, this option is used together with the Task.task_need_resources variable.
- ratio_unfinished:#
- type:
float
, optional, default:0.0
argument path:run_mdata/model_devi/resources/strategy/ratio_unfinished
The ratio of tasks that can be unfinished.
- customized_script_header_template_file:#
- type:
str
, optional
argument path:run_mdata/model_devi/resources/strategy/customized_script_header_template_file
The customized template file to generate job submitting script header, which overrides the default file.
- para_deg:#
- type:
int
, optional, default:1
argument path:run_mdata/model_devi/resources/para_deg
Decide how many tasks will be run in parallel.
- source_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/model_devi/resources/source_list
The env file to be sourced before the command execution.
- module_purge:#
- type:
bool
, optional, default:False
argument path:run_mdata/model_devi/resources/module_purge
Remove all modules on HPC system before module load (module_list)
- module_unload_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/model_devi/resources/module_unload_list
The modules to be unloaded on HPC system before submitting jobs
- module_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/model_devi/resources/module_list
The modules to be loaded on HPC system before submitting jobs
- envs:#
- type:
dict
, optional, default:{}
argument path:run_mdata/model_devi/resources/envs
The environment variables to be exported before submitting jobs
- prepend_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/model_devi/resources/prepend_script
Optional script run before jobs submitted.
- append_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/model_devi/resources/append_script
Optional script run after jobs submitted.
- wait_time:#
- type:
float
|int
, optional, default:0
argument path:run_mdata/model_devi/resources/wait_time
The waiting time in seconds after a single task is submitted.
Depending on the value of batch_type, different sub args are accepted.
- batch_type:#
- type:
str
(flag key)
argument path:run_mdata/model_devi/resources/batch_type
possible choices: Bohrium, Torque, Shell, OpenAPI, JH_UniScheduler, Slurm, SGE, DistributedShell, LSF, PBS, SlurmJobArray, Fugaku
The batch job system type loaded from machine/batch_type.
When batch_type is set to Bohrium (or its aliases bohrium, Lebesgue, lebesgue, DpCloudServer, dpcloudserver):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[Bohrium]/kwargs
This field is empty for this batch.
When batch_type is set to Torque (or its alias torque):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[Torque]/kwargs
This field is empty for this batch.
When batch_type is set to Shell (or its alias shell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[Shell]/kwargs
This field is empty for this batch.
When batch_type is set to OpenAPI (or its alias openapi):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[OpenAPI]/kwargs
This field is empty for this batch.
When batch_type is set to JH_UniScheduler (or its alias jh_unischeduler):
- kwargs:#
- type:
dict
argument path:run_mdata/model_devi/resources[JH_UniScheduler]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/resources[JH_UniScheduler]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #JSUB
When batch_type is set to Slurm (or its alias slurm):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[Slurm]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/resources[Slurm]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
When batch_type is set to SGE (or its alias sge):
- kwargs:#
- type:
dict
argument path:run_mdata/model_devi/resources[SGE]/kwargs
Extra arguments.
- pe_name:#
- type:
str
, optional, default:mpi
, alias: sge_pe_name
argument path:run_mdata/model_devi/resources[SGE]/kwargs/pe_name
The parallel environment name of SGE system.
- job_name:#
- type:
str
, optional, default:wDPjob
argument path:run_mdata/model_devi/resources[SGE]/kwargs/job_name
The name of SGE’s job.
When batch_type is set to DistributedShell (or its alias distributedshell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[DistributedShell]/kwargs
This field is empty for this batch.
When batch_type is set to LSF (or its alias lsf):
- kwargs:#
- type:
dict
argument path:run_mdata/model_devi/resources[LSF]/kwargs
Extra arguments.
- gpu_usage:#
- type:
bool
, optional, default:False
argument path:run_mdata/model_devi/resources[LSF]/kwargs/gpu_usage
Choosing if GPU is used in the calculation step.
- gpu_new_syntax:#
- type:
bool
, optional, default:False
argument path:run_mdata/model_devi/resources[LSF]/kwargs/gpu_new_syntax
For LSF >= 10.1.0.3, the new #BSUB option -gpu can be used. If False, the old syntax will be used.
- gpu_exclusive:#
- type:
bool
, optional, default:True
argument path:run_mdata/model_devi/resources[LSF]/kwargs/gpu_exclusive
Only takes effect when the new syntax is enabled. Controls whether tasks are submitted with exclusive GPU usage.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/resources[LSF]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #BSUB
When batch_type is set to PBS (or its alias pbs):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[PBS]/kwargs
This field is empty for this batch.
When batch_type is set to SlurmJobArray (or its alias slurmjobarray):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[SlurmJobArray]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/model_devi/resources[SlurmJobArray]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
- slurm_job_size:#
- type:
int
, optional, default:1
argument path:run_mdata/model_devi/resources[SlurmJobArray]/kwargs/slurm_job_size
Number of tasks in a Slurm job
When batch_type is set to Fugaku (or its alias fugaku):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/model_devi/resources[Fugaku]/kwargs
This field is empty for this batch.
- user_forward_files:#
- type:
list
, optional
argument path:run_mdata/model_devi/user_forward_files
Files to be forwarded to the remote machine.
- user_backward_files:#
- type:
list
, optional
argument path:run_mdata/model_devi/user_backward_files
Files to be transferred back from the remote machine.
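The model_devi block follows exactly the same schema as train. A sketch with a LAMMPS-style command and a larger group_size (to bundle many short MD tasks into one job) might look like this; every concrete value is a placeholder.

```json
{
    "model_devi": {
        "command": "lmp",
        "machine": {
            "batch_type": "Slurm",
            "context_type": "SSHContext",
            "local_root": "./",
            "remote_root": "/home/user/dpgen_workdir",
            "remote_profile": {"hostname": "login.hpc.example.org", "username": "user"}
        },
        "resources": {
            "number_node": 1,
            "cpu_per_node": 4,
            "gpu_per_node": 1,
            "queue_name": "gpu",
            "group_size": 10
        }
    }
}
```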
- fp:#
- type:
dict
argument path:run_mdata/fp
Parameters of command, machine, and resources for fp
- command:#
- type:
str
argument path:run_mdata/fp/command
Command of a program.
- machine:#
- type:
dict
argument path:run_mdata/fp/machine
- batch_type:#
- type:
str
argument path:run_mdata/fp/machine/batch_type
The batch job system type. Option: Fugaku, LSF, SlurmJobArray, JH_UniScheduler, Shell, Torque, PBS, SGE, Slurm, DistributedShell, OpenAPI, Bohrium
- local_root:#
- type:
str
|NoneType
argument path:run_mdata/fp/machine/local_root
The directory where the tasks and related files are located. Typically the project directory.
- remote_root:#
- type:
str
|NoneType
, optional
argument path:run_mdata/fp/machine/remote_root
The dir where the tasks are executed on the remote machine. Only needed when context is not lazy-local.
- clean_asynchronously:#
- type:
bool
, optional, default:False
argument path:run_mdata/fp/machine/clean_asynchronously
Clean the remote directory asynchronously after the job finishes.
Depending on the value of context_type, different sub args are accepted.
- context_type:#
- type:
str
(flag key)
argument path:run_mdata/fp/machine/context_type
possible choices: LocalContext, BohriumContext, HDFSContext, LazyLocalContext, OpenAPIContext, SSHContext
The connection used to the remote machine. Option: LocalContext, SSHContext, LazyLocalContext, BohriumContext, OpenAPIContext, HDFSContext
When context_type is set to LocalContext (or its aliases localcontext, Local, local):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/fp/machine[LocalContext]/remote_profile
The information used to maintain the local machine.
- symlink:#
- type:
bool
, optional, default:True
argument path:run_mdata/fp/machine[LocalContext]/remote_profile/symlink
Whether to use symbolic links to replace copy. This option should be turned off if the local directory is not accessible on the Batch system.
When context_type is set to BohriumContext (or its aliases bohriumcontext, Bohrium, bohrium, DpCloudServerContext, dpcloudservercontext, DpCloudServer, dpcloudserver, LebesgueContext, lebesguecontext, Lebesgue, lebesgue):
- remote_profile:#
- type:
dict
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile
The information used to maintain the connection with remote machine.
- email:#
- type:
str
, optional
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/email
Email
- password:#
- type:
str
, optional
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/password
Password
- program_id:#
- type:
int
, alias: project_id
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/program_id
Program ID
- retry_count:#
- type:
NoneType
|int
, optional, default:2
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/retry_count
The retry count when a job is terminated
- ignore_exit_code:#
- type:
bool
, optional, default:True
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/ignore_exit_code
The job state will be marked as finished if the exit code is non-zero when set to True. Otherwise, the job state will be designated as terminated.
- keep_backup:#
- type:
bool
, optional
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/keep_backup
keep download and upload zip
- input_data:#
- type:
dict
argument path:run_mdata/fp/machine[BohriumContext]/remote_profile/input_data
Configuration of job
When context_type is set to HDFSContext (or its aliases hdfscontext, HDFS, hdfs):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/fp/machine[HDFSContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to LazyLocalContext (or its aliases lazylocalcontext, LazyLocal, lazylocal):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/fp/machine[LazyLocalContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to OpenAPIContext (or its aliases openapicontext, OpenAPI, openapi):
- remote_profile:#
- type:
dict
, optional
argument path:run_mdata/fp/machine[OpenAPIContext]/remote_profile
The information used to maintain the connection with remote machine. This field is empty for this context.
When context_type is set to SSHContext (or its aliases sshcontext, SSH, ssh):
- remote_profile:#
- type:
dict
argument path:run_mdata/fp/machine[SSHContext]/remote_profile
The information used to maintain the connection with remote machine.
- hostname:#
- type:
str
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/hostname
hostname or ip of ssh connection.
- username:#
- type:
str
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/username
username of target linux system
- password:#
- type:
str
, optional
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/password
(deprecated) password of linux system. Please use SSH keys instead to improve security.
- port:#
- type:
int
, optional, default:22
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/port
ssh connection port.
- key_filename:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/key_filename
key filename used by ssh connection. If left None, find key in ~/.ssh or use password for login
- passphrase:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/passphrase
passphrase of key used by ssh connection
- timeout:#
- type:
int
, optional, default:10
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/timeout
timeout of ssh connection
- totp_secret:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/totp_secret
Time-based one time password secret. It should be a base32-encoded string extracted from the 2D code.
- tar_compress:#
- type:
bool
, optional, default:True
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/tar_compress
The archive will be compressed in upload and download if it is True. If not, compression will be skipped.
- look_for_keys:#
- type:
bool
, optional, default:True
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/look_for_keys
enable searching for discoverable private key files in ~/.ssh/
- execute_command:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/machine[SSHContext]/remote_profile/execute_command
execute command after ssh connection is established.
- resources:#
- type:
dict
argument path:run_mdata/fp/resources
- number_node:#
- type:
int
, optional, default:1
argument path:run_mdata/fp/resources/number_node
The number of nodes required for each job.
- cpu_per_node:#
- type:
int
, optional, default:1
argument path:run_mdata/fp/resources/cpu_per_node
CPU numbers of each node assigned to each job.
- gpu_per_node:#
- type:
int
, optional, default:0
argument path:run_mdata/fp/resources/gpu_per_node
GPU numbers of each node assigned to each job.
- queue_name:#
- type:
str
, optional, default: (empty string)
argument path:run_mdata/fp/resources/queue_name
The queue name of batch job scheduler system.
- group_size:#
- type:
int
argument path:run_mdata/fp/resources/group_size
The number of tasks in a job. 0 means infinity.
- custom_flags:#
- type:
typing.List[str]
, optional
argument path:run_mdata/fp/resources/custom_flags
The extra lines passed to the job submitting script header.
- strategy:#
- type:
dict
, optional
argument path:run_mdata/fp/resources/strategy
Strategies used to generate job submitting scripts.
- if_cuda_multi_devices:#
- type:
bool
, optional, default:False
argument path:run_mdata/fp/resources/strategy/if_cuda_multi_devices
If there are multiple NVIDIA GPUs on the node and we want to assign tasks to different GPUs. If true, dpdispatcher will manually export the environment variable CUDA_VISIBLE_DEVICES to a different value for each task. Usually, this option is used together with the Task.task_need_resources variable.
- ratio_unfinished:#
- type:
float
, optional, default:0.0
argument path:run_mdata/fp/resources/strategy/ratio_unfinished
The ratio of tasks that can be unfinished.
- customized_script_header_template_file:#
- type:
str
, optional
argument path:run_mdata/fp/resources/strategy/customized_script_header_template_file
The customized template file to generate job submitting script header, which overrides the default file.
- para_deg:#
- type:
int
, optional, default:1
argument path:run_mdata/fp/resources/para_deg
Decide how many tasks will be run in parallel.
- source_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/fp/resources/source_list
The env file to be sourced before the command execution.
- module_purge:#
- type:
bool
, optional, default:False
argument path:run_mdata/fp/resources/module_purge
Remove all modules on HPC system before module load (module_list)
- module_unload_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/fp/resources/module_unload_list
The modules to be unloaded on HPC system before submitting jobs
- module_list:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/fp/resources/module_list
The modules to be loaded on HPC system before submitting jobs
- envs:#
- type:
dict
, optional, default:{}
argument path:run_mdata/fp/resources/envs
The environment variables to be exported before submitting jobs
- prepend_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/fp/resources/prepend_script
Optional script run before jobs submitted.
- append_script:#
- type:
typing.List[str]
, optional, default:[]
argument path:run_mdata/fp/resources/append_script
Optional script run after jobs submitted.
- wait_time:#
- type:
float
|int
, optional, default:0
argument path:run_mdata/fp/resources/wait_time
The waiting time in seconds after a single task is submitted.
Depending on the value of batch_type, different sub args are accepted.
- batch_type:#
- type:
str
(flag key)
argument path:run_mdata/fp/resources/batch_type
possible choices: Bohrium, Torque, Shell, OpenAPI, JH_UniScheduler, Slurm, SGE, DistributedShell, LSF, PBS, SlurmJobArray, Fugaku
The batch job system type loaded from machine/batch_type.
When batch_type is set to Bohrium (or its aliases bohrium, Lebesgue, lebesgue, DpCloudServer, dpcloudserver):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[Bohrium]/kwargs
This field is empty for this batch.
When batch_type is set to Torque (or its alias torque):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[Torque]/kwargs
This field is empty for this batch.
When batch_type is set to Shell (or its alias shell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[Shell]/kwargs
This field is empty for this batch.
When batch_type is set to OpenAPI (or its alias openapi):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[OpenAPI]/kwargs
This field is empty for this batch.
When batch_type is set to JH_UniScheduler (or its alias jh_unischeduler):
- kwargs:#
- type:
dict
argument path:run_mdata/fp/resources[JH_UniScheduler]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/resources[JH_UniScheduler]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #JSUB
When batch_type is set to Slurm (or its alias slurm):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[Slurm]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/resources[Slurm]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
When batch_type is set to SGE (or its alias sge):
- kwargs:#
- type:
dict
argument path:run_mdata/fp/resources[SGE]/kwargs
Extra arguments.
- pe_name:#
- type:
str
, optional, default:mpi
, alias: sge_pe_name
argument path:run_mdata/fp/resources[SGE]/kwargs/pe_name
The parallel environment name of SGE system.
- job_name:#
- type:
str
, optional, default:wDPjob
argument path:run_mdata/fp/resources[SGE]/kwargs/job_name
The name of SGE’s job.
When batch_type is set to DistributedShell (or its alias distributedshell):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[DistributedShell]/kwargs
This field is empty for this batch.
When batch_type is set to LSF (or its alias lsf):
- kwargs:#
- type:
dict
argument path:run_mdata/fp/resources[LSF]/kwargs
Extra arguments.
- gpu_usage:#
- type:
bool
, optional, default:False
argument path:run_mdata/fp/resources[LSF]/kwargs/gpu_usage
Choosing if GPU is used in the calculation step.
- gpu_new_syntax:#
- type:
bool
, optional, default:False
argument path:run_mdata/fp/resources[LSF]/kwargs/gpu_new_syntax
For LSF >= 10.1.0.3, the new #BSUB option -gpu can be used. If False, the old syntax will be used.
- gpu_exclusive:#
- type:
bool
, optional, default:True
argument path:run_mdata/fp/resources[LSF]/kwargs/gpu_exclusive
Only takes effect when the new syntax is enabled. Controls whether tasks are submitted with exclusive GPU usage.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/resources[LSF]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #BSUB
When batch_type is set to PBS (or its alias pbs):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[PBS]/kwargs
This field is empty for this batch.
When batch_type is set to SlurmJobArray (or its alias slurmjobarray):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[SlurmJobArray]/kwargs
Extra arguments.
- custom_gpu_line:#
- type:
str
|NoneType
, optional, default:None
argument path:run_mdata/fp/resources[SlurmJobArray]/kwargs/custom_gpu_line
Custom GPU configuration, starting with #SBATCH
- slurm_job_size:#
- type:
int
, optional, default:1
argument path:run_mdata/fp/resources[SlurmJobArray]/kwargs/slurm_job_size
Number of tasks in a Slurm job
When batch_type is set to Fugaku (or its alias fugaku):
- kwargs:#
- type:
dict
, optional
argument path:run_mdata/fp/resources[Fugaku]/kwargs
This field is empty for this batch.
- user_forward_files:#
- type:
list
, optional
argument path:run_mdata/fp/user_forward_files
Files to be forwarded to the remote machine.
- user_backward_files:#
- type:
list
, optional
argument path:run_mdata/fp/user_backward_files
Files to be transferred back from the remote machine.
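Likewise, the fp block uses the same schema. A hedged example for a CPU-only VASP run on the same cluster could be written as below; the command, queue, core counts, and sourced environment file are placeholders.

```json
{
    "fp": {
        "command": "mpirun -n 32 vasp_std",
        "machine": {
            "batch_type": "Slurm",
            "context_type": "SSHContext",
            "local_root": "./",
            "remote_root": "/home/user/dpgen_workdir",
            "remote_profile": {"hostname": "login.hpc.example.org", "username": "user"}
        },
        "resources": {
            "number_node": 1,
            "cpu_per_node": 32,
            "gpu_per_node": 0,
            "queue_name": "cpu",
            "group_size": 5,
            "source_list": ["/opt/vasp/env.sh"]
        }
    }
}
```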