Run Python scripts


DPDispatcher can be used to directly run a single Python script:

dpdisp run script.py

The script must include inline script metadata compliant with PEP 723. An example of the script is shown below.

# /// script
# # dpdispatcher doesn't use `requires-python` and `dependencies`
# requires-python = ">=3"
# dependencies = [
# ]
# [tool.dpdispatcher]
# work_base = "./"
# forward_common_files=[]
# backward_common_files=[]
# [tool.dpdispatcher.machine]
# batch_type = "Shell"
# local_root = "./"
# context_type = "LazyLocalContext"
# [tool.dpdispatcher.resources]
# number_node = 1
# cpu_per_node = 1
# gpu_per_node = 0
# group_size = 0
# [[tool.dpdispatcher.task_list]]
# # no need to contain the script filename
# # `uv run` supports dependencies declared in the PEP-723 format
# # See: https://docs.astral.sh/uv/guides/scripts/
# command = "uv run"
# # can be a glob pattern
# task_work_path = "./"
# forward_files = []
# backward_files = ["log"]
# ///

print("hello world!")

The PEP 723 metadata entries for tool.dpdispatcher are defined as follows:

pep723:#
type: dict
argument path: pep723

PEP 723 metadata

work_base:#
type: str, optional, default: ./
argument path: pep723/work_base

Base directory for the work

forward_common_files:#
type: typing.List[str], optional, default: []
argument path: pep723/forward_common_files

Common files to forward to the remote machine

backward_common_files:#
type: typing.List[str], optional, default: []
argument path: pep723/backward_common_files

Common files to download back from the remote machine

machine:#
type: dict
argument path: pep723/machine

Machine configuration. See related documentation for details.

batch_type:#
type: str
argument path: pep723/machine/batch_type

Batch backend used to execute jobs. Options: Shell, PBS, DistributedShell, JH_UniScheduler, SlurmJobArray, LSF, OpenAPI, SGE, Torque, Slurm, Bohrium, Fugaku

local_root:#
type: str | NoneType
argument path: pep723/machine/local_root

Local project root used by DPDispatcher to find task directories and local files. If submission.work_base is a relative path, it is resolved inside this directory; if submission.work_base is absolute, it is used as-is and local_root is ignored.
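The resolution rule for work_base happens to match how path joining behaves in Python's `pathlib`, so it can be illustrated in a few lines. This is a sketch only: `resolve_work_base` is a hypothetical helper, not a DPDispatcher function, and the paths are made up for the example.

```python
from pathlib import PurePosixPath


def resolve_work_base(local_root: str, work_base: str) -> PurePosixPath:
    """Mimic the rule described above (illustrative helper only)."""
    # Joining onto an absolute right-hand path discards the left-hand side,
    # so an absolute work_base is returned unchanged.
    return PurePosixPath(local_root) / work_base


relative_case = resolve_work_base("/home/user/project", "run_01")
absolute_case = resolve_work_base("/home/user/project", "/scratch/run_01")
print(relative_case)  # /home/user/project/run_01
print(absolute_case)  # /scratch/run_01
```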

remote_root:#
type: str | NoneType, optional
argument path: pep723/machine/remote_root

Remote root directory used by non-local contexts such as SSH. DPDispatcher creates and uses a submission-specific working directory beneath this root on the remote side. For SSHContext, this path should be absolute.

clean_asynchronously:#
type: bool, optional, default: False
argument path: pep723/machine/clean_asynchronously

Clean the remote working directory asynchronously after the job finishes. Avoid enabling this while debugging, because it can remove remote artifacts before you inspect them.

retry_count:#
type: int, optional, default: 3
argument path: pep723/machine/retry_count

How many times DPDispatcher will retry a failed job before raising an error.

Depending on the value of context_type, different sub args are accepted.

context_type:#

Execution context / connection type used to reach the execution environment. Options: LazyLocalContext, SSHContext, BohriumContext, OpenAPIContext, LocalContext, HDFSContext

When context_type is set to BohriumContext (or its aliases bohriumcontext, Bohrium, bohrium, DpCloudServerContext, dpcloudservercontext, DpCloudServer, dpcloudserver, LebesgueContext, lebesguecontext, Lebesgue, lebesgue):

remote_profile:#
type: dict
argument path: pep723/machine[BohriumContext]/remote_profile

Configuration for Bohrium submission, including login credentials, project selection, and job-handling behavior.

email:#
type: str, optional
argument path: pep723/machine[BohriumContext]/remote_profile/email

Email address used to log in to Bohrium.

password:#
type: str, optional
argument path: pep723/machine[BohriumContext]/remote_profile/password

Password used together with email or phone login. If BOHR_TICKET is set, password-based login can be skipped.

phone:#
type: str, optional
argument path: pep723/machine[BohriumContext]/remote_profile/phone

Phone number used to log in when email is not used.

program_id:#
type: int, alias: project_id
argument path: pep723/machine[BohriumContext]/remote_profile/program_id

Program / project ID used to place uploaded jobs under the correct Bohrium project namespace.

retry_count:#
type: NoneType | int, optional, default: 2
argument path: pep723/machine[BohriumContext]/remote_profile/retry_count

How many times a terminated remote job is retried on the platform side before giving up.

ignore_exit_code:#
type: bool, optional, default: True
argument path: pep723/machine[BohriumContext]/remote_profile/ignore_exit_code

Whether a non-zero exit code from the remote platform is still treated as finished. If False, such jobs are marked as terminated.

keep_backup:#
type: bool, optional
argument path: pep723/machine[BohriumContext]/remote_profile/keep_backup

Whether to keep uploaded/downloaded zip archives in the local backup directory after transfer.

input_data:#
type: dict
argument path: pep723/machine[BohriumContext]/remote_profile/input_data

Platform-specific job configuration passed through to the Bohrium API.

When context_type is set to LocalContext (or its aliases localcontext, Local, local):

remote_profile:#
type: dict, optional
argument path: pep723/machine[LocalContext]/remote_profile

Options controlling how files are staged between local_root and remote_root when both paths are on the local filesystem.

When context_type is set to HDFSContext (or its aliases hdfscontext, HDFS, hdfs):

remote_profile:#
type: dict, optional
argument path: pep723/machine[HDFSContext]/remote_profile

The information used to maintain the connection with the remote machine. This field is empty for this context.

When context_type is set to SSHContext (or its aliases sshcontext, SSH, ssh):

remote_profile:#
type: dict
argument path: pep723/machine[SSHContext]/remote_profile

SSH connection settings for the remote machine, including authentication, timeouts, and optional proxy/jump-host behavior.

hostname:#
type: str
argument path: pep723/machine[SSHContext]/remote_profile/hostname

Hostname or IP address of the SSH target machine.

username:#
type: str
argument path: pep723/machine[SSHContext]/remote_profile/username

Username used to log in to the target system.

password:#
type: str, optional
argument path: pep723/machine[SSHContext]/remote_profile/password

(Deprecated) Password of the Linux account on the target machine. Use SSH keys instead for better security.

port:#
type: int, optional, default: 22
argument path: pep723/machine[SSHContext]/remote_profile/port

SSH port of the target machine. Usually 22.

key_filename:#
type: str | NoneType, optional, default: None
argument path: pep723/machine[SSHContext]/remote_profile/key_filename

Path to the private key file used for SSH authentication. If left None, DPDispatcher can try discoverable keys in ~/.ssh or fall back to password-based login if configured.

passphrase:#
type: str | NoneType, optional, default: None
argument path: pep723/machine[SSHContext]/remote_profile/passphrase

Passphrase for the SSH private key, if the key is encrypted.

timeout:#
type: int, optional, default: 10
argument path: pep723/machine[SSHContext]/remote_profile/timeout

Timeout in seconds for establishing the SSH connection.

totp_secret:#
type: str | NoneType, optional, default: None
argument path: pep723/machine[SSHContext]/remote_profile/totp_secret

Time-based one-time-password secret used for keyboard-interactive 2FA. It should be a base32-encoded string.
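To check that a base32 secret is well formed before putting it into the metadata, the one-time code can be computed with the standard RFC 6238 algorithm using only the Python standard library. This is a sketch of the generic algorithm (HMAC-SHA1, 30-second step); it is an assumption that the target server uses these defaults, and `totp` is a hypothetical helper, not DPDispatcher's implementation.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at_time: Optional[float] = None,
         digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code from a base32-encoded secret."""
    # Base32 decoding needs input padded to a multiple of 8 characters.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    now = time.time() if at_time is None else at_time
    counter = int(now // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation as defined in RFC 4226.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


# RFC 6238 test secret "12345678901234567890", base32-encoded.
RFC_SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"
print(totp(RFC_SECRET, at_time=59, digits=8))  # 94287082 (RFC 6238 test vector)
```

A secret that is not valid base32 makes `base64.b32decode` raise an error, which is a quick way to catch copy-and-paste mistakes.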

tar_compress:#
type: bool, optional, default: True
argument path: pep723/machine[SSHContext]/remote_profile/tar_compress

Whether upload/download tar archives are compressed. Keeping this True usually reduces transfer size at the cost of extra CPU time.

look_for_keys:#
type: bool, optional, default: True
argument path: pep723/machine[SSHContext]/remote_profile/look_for_keys

Whether to search for discoverable private key files in ~/.ssh when key_filename is not provided.

execute_command:#
type: str | NoneType, optional, default: None
argument path: pep723/machine[SSHContext]/remote_profile/execute_command

Optional command executed immediately after the SSH connection is established.

proxy_command:#
type: str | NoneType, optional, default: None
argument path: pep723/machine[SSHContext]/remote_profile/proxy_command

Optional SSH ProxyCommand used to reach the target through an intermediate host or tunnel.

When context_type is set to OpenAPIContext (or its aliases openapicontext, OpenAPI, openapi):

remote_profile:#
type: dict, optional
argument path: pep723/machine[OpenAPIContext]/remote_profile

The information used to maintain the connection with the remote machine. This field is empty for this context.

When context_type is set to LazyLocalContext (or its aliases lazylocalcontext, LazyLocal, lazylocal):

remote_profile:#
type: dict, optional
argument path: pep723/machine[LazyLocalContext]/remote_profile

The information used to maintain the connection with the remote machine. This field is empty for this context.

resources:#
type: dict
argument path: pep723/resources

Resources configuration. See related documentation for details.

number_node:#
type: int, optional, default: 1
argument path: pep723/resources/number_node

Number of nodes requested for each scheduler job generated by DPDispatcher.

cpu_per_node:#
type: int, optional, default: 1
argument path: pep723/resources/cpu_per_node

Number of CPUs requested on each node for each scheduler job.

gpu_per_node:#
type: int, optional, default: 0
argument path: pep723/resources/gpu_per_node

Number of GPUs requested on each node for each scheduler job.

queue_name:#
type: str, optional, default: (empty string)
argument path: pep723/resources/queue_name

Queue or partition name used by the selected batch system. For local Shell runs this is usually an empty string; for Slurm it typically maps to a partition.

group_size:#
type: int
argument path: pep723/resources/group_size

How many tasks are packed into one scheduler job. For example, 20 tasks with group_size=5 are typically split into 4 jobs. Use 1 for the simplest one-task workflow. 0 means no explicit upper limit in the grouping logic.
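The arithmetic in the description can be reproduced with a simple chunking function. This is an illustrative sketch rather than DPDispatcher's grouping code; `group_tasks` is a hypothetical name, and treating `group_size=0` as "put every task into one job" is an assumption made for the example.

```python
def group_tasks(tasks: list, group_size: int) -> list:
    """Split tasks into consecutive groups of at most group_size items."""
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    if group_size == 0:
        # Assumption: no explicit upper limit is modeled as a single job.
        return [list(tasks)] if tasks else []
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


tasks = [f"task{i:02d}" for i in range(20)]
jobs = group_tasks(tasks, group_size=5)
print(len(jobs))  # 4
print(jobs[0])    # ['task00', 'task01', 'task02', 'task03', 'task04']
```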

custom_flags:#
type: typing.List[str], optional
argument path: pep723/resources/custom_flags

Extra scheduler-header lines inserted into the generated submission script, typically for backend-specific options that are not covered by the standard fields.

strategy:#
type: dict, optional
argument path: pep723/resources/strategy

Strategy options that affect how DPDispatcher generates and evaluates submission scripts.

if_cuda_multi_devices:#
type: bool, optional, default: False
argument path: pep723/resources/strategy/if_cuda_multi_devices

If a node has multiple NVIDIA GPUs, assign different tasks inside the same job to different GPUs by setting CUDA_VISIBLE_DEVICES automatically. Usually used together with para_deg > 1 and task-level resource awareness.

ratio_unfinished:#
type: float, optional, default: 0.0
argument path: pep723/resources/strategy/ratio_unfinished

Maximum fraction of tasks allowed to remain unfinished when evaluating job completion. Use 0.0 for the strict default that requires every task to finish.
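Read literally, the description amounts to comparing the unfinished fraction against a threshold. The sketch below expresses that reading; `job_complete` is a hypothetical helper, and the exact comparison DPDispatcher performs internally may differ.

```python
def job_complete(n_finished: int, n_total: int,
                 ratio_unfinished: float = 0.0) -> bool:
    """Return True when the unfinished fraction is within the allowed ratio."""
    if n_total == 0:
        return True  # nothing to wait for
    unfinished_fraction = (n_total - n_finished) / n_total
    return unfinished_fraction <= ratio_unfinished


print(job_complete(19, 20))                        # False: strict default
print(job_complete(19, 20, ratio_unfinished=0.1))  # True: 5% unfinished is tolerated
```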

customized_script_header_template_file:#
type: str, optional
argument path: pep723/resources/strategy/customized_script_header_template_file

Custom template file for the scheduler-header portion of generated submission scripts. Overrides the default template.

para_deg:#
type: int, optional, default: 1
argument path: pep723/resources/para_deg

How many tasks inside one generated job are run in parallel. This is different from group_size: group_size controls how many tasks are bundled into a job, while para_deg controls concurrency within that job. Keep para_deg=1 for the safest default.
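The difference between bundling and concurrency can be illustrated with a worker pool: the job still contains all of its tasks, but at most para_deg of them run at any moment. This is a conceptual sketch in Python, assuming a thread pool as a stand-in; DPDispatcher realizes this inside the generated job script, and `run_job` is a hypothetical helper.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(tasks: list, para_deg: int = 1) -> tuple:
    """Run all tasks of one job with at most para_deg running at once."""
    active = 0
    peak = 0
    lock = threading.Lock()

    def run_task(name: str) -> str:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for the real task command
        with lock:
            active -= 1
        return name

    with ThreadPoolExecutor(max_workers=para_deg) as pool:
        finished = list(pool.map(run_task, tasks))
    # peak records the highest number of tasks observed running together.
    return finished, peak


job_tasks = ["task1", "task2", "task3", "task4", "task5"]
finished, peak = run_job(job_tasks, para_deg=2)
print(finished, peak)
```

Here the job holds five tasks (the role of group_size), while `peak` never exceeds 2 (the role of para_deg).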

source_list:#
type: typing.List[str], optional, default: []
argument path: pep723/resources/source_list

Shell scripts or environment files sourced before task commands run. Useful on HPC systems for activating software stacks explicitly instead of relying on login-shell defaults.

module_purge:#
type: bool, optional, default: False
argument path: pep723/resources/module_purge

Whether to run ‘module purge’ before applying module_unload_list and module_list. Mainly useful on HPC systems.

module_unload_list:#
type: typing.List[str], optional, default: []
argument path: pep723/resources/module_unload_list

Modules to unload before loading the requested modules. Mainly relevant on HPC systems with environment modules.

module_list:#
type: typing.List[str], optional, default: []
argument path: pep723/resources/module_list

Modules to load before executing tasks. Mainly relevant on HPC systems with environment modules.

envs:#
type: dict, optional, default: {}
argument path: pep723/resources/envs

Environment variables exported before executing tasks.

prepend_script:#
type: typing.List[str], optional, default: []
argument path: pep723/resources/prepend_script

Optional shell lines inserted before task commands in the generated job script.

append_script:#
type: typing.List[str], optional, default: []
argument path: pep723/resources/append_script

Optional shell lines inserted after task commands in the generated job script.
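The environment-related fields above all contribute shell lines around the task commands. The sketch below shows one plausible layout. The source only states that module purge precedes module_unload_list and module_list, that prepend_script comes before the task commands, and that append_script comes after them; the placement of source_list and envs, the exact `module` and `export` syntax, and the helper name `render_env_section` are assumptions.

```python
import shlex


def render_env_section(resources: dict) -> list:
    """Render environment setup lines from a resources dict (illustrative)."""
    lines = []
    if resources.get("module_purge", False):
        lines.append("module purge")
    lines += [f"module unload {m}" for m in resources.get("module_unload_list", [])]
    lines += [f"module load {m}" for m in resources.get("module_list", [])]
    # Assumed placement: sourced files and exported variables follow the modules.
    lines += [f"source {shlex.quote(p)}" for p in resources.get("source_list", [])]
    lines += [
        f"export {key}={shlex.quote(str(value))}"
        for key, value in resources.get("envs", {}).items()
    ]
    lines += resources.get("prepend_script", [])
    lines.append("# ... task commands run here ...")
    lines += resources.get("append_script", [])
    return lines


example_resources = {
    "module_purge": True,
    "module_unload_list": ["gcc"],
    "module_list": ["python/3.11"],
    "source_list": ["/opt/tools/env.sh"],
    "envs": {"OMP_NUM_THREADS": 4},
    "prepend_script": ["echo start"],
    "append_script": ["echo done"],
}
script_lines = render_env_section(example_resources)
print("\n".join(script_lines))
```

The module names, file path, and variable in `example_resources` are placeholders chosen for the example.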

wait_time:#
type: float | int, optional, default: 0
argument path: pep723/resources/wait_time

Delay in seconds inserted after a job is submitted or resubmitted. Usually keep 0 unless the scheduler/site asks you to throttle submission pace.

kwargs:#
type: dict, optional
argument path: pep723/resources/kwargs

Additional options; the accepted keys vary by machine.

batch_type:#
type: str, optional
argument path: pep723/resources/batch_type

Accepted here so that this key does not cause an error when strict checking is enabled.

task_list:#
type: list
argument path: pep723/task_list

List of tasks to execute.

This argument takes a list with each element containing the following:

command:#
type: str, optional, default: python
argument path: pep723/task_list/command

Python interpreter or launcher used to run the script. The script filename does not need to be included.

task_work_path:#
type: str
argument path: pep723/task_list/task_work_path

Working directory of this task, specified as a relative path inside submission.work_base. Absolute paths are not supported and may break staging or remote execution. For the smallest local example, use ‘.’. If you use a subdirectory such as ‘task1/’, the command runs inside that subdirectory. Can be a glob pattern.
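A glob-style task_work_path can be previewed locally with `pathlib` to see which directories it would select. The sketch assumes that each matching directory under work_base becomes a separate task working directory, which is an interpretation of "can be a glob pattern"; `expand_task_work_paths` is a hypothetical helper and the directory names are invented.

```python
import tempfile
from pathlib import Path


def expand_task_work_paths(work_base: Path, pattern: str) -> list:
    """List directories under work_base that match a relative glob pattern."""
    if Path(pattern).is_absolute():
        raise ValueError("task_work_path must be relative to work_base")
    return sorted(
        p.relative_to(work_base).as_posix()
        for p in work_base.glob(pattern)
        if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ("task1", "task2", "notes"):
        (base / name).mkdir()
    matched = expand_task_work_paths(base, "task*")

print(matched)  # ['task1', 'task2']
```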

forward_files:#
type: typing.List[str], optional, default: []
argument path: pep723/task_list/forward_files

Files to upload for this task before execution. Paths are resolved relative to this task’s task_work_path. Put per-task inputs here; files shared by all tasks belong in submission.forward_common_files.

backward_files:#
type: typing.List[str], optional, default: []
argument path: pep723/task_list/backward_files

Files to download for this task after execution. Paths are collected from this task’s task_work_path on the execution side and synchronized back to the same relative task directory under the local staging root (typically machine.local_root/work_base).

outlog:#
type: str | NoneType, optional, default: log
argument path: pep723/task_list/outlog

Filename used to redirect stdout inside task_work_path while the task runs. If this file is downloaded or synchronized back, it typically appears under the same relative task directory on the local side.

errlog:#
type: str | NoneType, optional, default: err
argument path: pep723/task_list/errlog

Filename used to redirect stderr inside task_work_path while the task runs. If this file is downloaded or synchronized back, it typically appears under the same relative task directory on the local side.
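The effect of outlog and errlog can be reproduced locally by redirecting a command's output streams into files inside the working directory. This is a minimal sketch with the default names `log` and `err`; `run_with_logs` is a hypothetical helper that handles only string filenames, and it does not reflect how DPDispatcher writes its job scripts.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_logs(command: list, task_work_path: Path,
                  outlog: str = "log", errlog: str = "err") -> int:
    """Run command in task_work_path, sending stdout and stderr to files."""
    with open(task_work_path / outlog, "w") as out, \
            open(task_work_path / errlog, "w") as err:
        result = subprocess.run(command, cwd=task_work_path, stdout=out, stderr=err)
    return result.returncode


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    exit_code = run_with_logs([sys.executable, "-c", "print('hello world!')"], work)
    log_text = (work / "log").read_text()
    err_text = (work / "err").read_text()

print(exit_code, repr(log_text), repr(err_text))
```

Listing `log` in backward_files, as the example script at the top of this page does, is what brings the stdout file back after the run.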

$ref support#

dpdisp run supports loading external JSON/YAML snippets via $ref in tool.dpdispatcher metadata. For security reasons, this feature is disabled by default.

Enable explicitly with:

dpdisp run script.py --allow-ref