dpgen.remote package

Submodules

dpgen.remote.RemoteJob module

class dpgen.remote.RemoteJob.CloudMachineJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

check_status()
submit(job_dirs, cmd, args=None, resources=None)
class dpgen.remote.RemoteJob.JobStatus(value)

Bases: Enum

An enumeration.

finished = 5
running = 3
terminated = 4
unknown = 100
unsubmitted = 1
waiting = 2
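
The members listed above can be mirrored in a small standalone enum to show how status values might be compared. This is only a sketch: it re-declares the documented values locally instead of importing dpgen, and the helper `has_stopped` is hypothetical and not part of the package.

```python
from enum import Enum


class JobStatus(Enum):
    """Local stand-in that mirrors the values documented above."""
    unsubmitted = 1
    waiting = 2
    running = 3
    terminated = 4
    finished = 5
    unknown = 100


def has_stopped(status):
    """Hypothetical helper: True if the job is terminated or finished."""
    return status in (JobStatus.terminated, JobStatus.finished)


# Look a member up by value, then classify a few states.
running = JobStatus(3)
print(running.name)                        # running
print(has_stopped(JobStatus.finished))     # True
print(has_stopped(JobStatus.waiting))      # False
```

Whether `terminated` should count as a final state in a real workflow (for example, when jobs are resubmitted) is an assumption of this sketch.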
class dpgen.remote.RemoteJob.LSFJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

check_limit(task_max)
check_status()
submit(job_dirs, cmd, args=None, resources=None, restart=False)
class dpgen.remote.RemoteJob.PBSJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

check_status()
submit(job_dirs, cmd, args=None, resources=None)
class dpgen.remote.RemoteJob.RemoteJob(ssh_session, local_root, job_uuid=None)

Bases: object

block_call(cmd)
block_checkcall(cmd)
clean()
download(job_dirs, remote_down_files, back_error=False)
get_job_root()
upload(job_dirs, local_up_files, dereference=True)
class dpgen.remote.RemoteJob.SSHSession(jdata)

Bases: object

close()
get_session_root()
get_ssh_client()
class dpgen.remote.RemoteJob.SlurmJob(ssh_session, local_root, job_uuid=None)

Bases: RemoteJob

check_status()
submit(job_dirs, cmd, args=None, resources=None, restart=False)
class dpgen.remote.RemoteJob.awsMachineJob(remote_root, work_path, job_uuid=None)

Bases: object

download(job_dir, remote_down_files, dereference=True)
upload(job_dir, local_up_files, dereference=True)

dpgen.remote.decide_machine module

dpgen.remote.decide_machine.convert_mdata(mdata, task_types=['train', 'model_devi', 'fp'])

Convert mdata for the DP-GEN main process. The new convention is mdata["fp"]["machine"], whereas DP-GEN needs mdata["fp_machine"].

Note that the function that automatically selected the most available machine has been deprecated, since it was used only by Angus and supported only Slurm. It may be reimplemented in the future.

Parameters

mdata : dict

Machine parameters to be converted.

task_types : list of str

Types of tasks. Default is ["train", "model_devi", "fp"].

Returns

dict

The converted mdata.
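
The conversion described above, from mdata["fp"]["machine"] to mdata["fp_machine"], can be sketched as follows. This is an illustrative stand-in, not dpgen's implementation: the function name `flatten_mdata` is hypothetical, the assumption that every nested key is flattened the same way goes beyond what the description states, and the keys `batch_type` and `numb_node` are placeholder values chosen for the example.

```python
import copy


def flatten_mdata(mdata, task_types=("train", "model_devi", "fp")):
    """Hypothetical sketch of the nested-to-flat key conversion.

    For each task type, copy mdata[task_type][key] to a top-level
    entry named "<task_type>_<key>", e.g. mdata["fp"]["machine"]
    becomes mdata["fp_machine"]. The input is not modified.
    """
    converted = copy.deepcopy(mdata)
    for task_type in task_types:
        section = converted.get(task_type)
        if not isinstance(section, dict):
            continue  # no nested section for this task type
        for key, value in section.items():
            converted[f"{task_type}_{key}"] = value
    return converted


# Placeholder machine data in the nested (new) convention.
nested = {
    "fp": {
        "machine": {"batch_type": "slurm"},
        "resources": {"numb_node": 1},
    }
}
flat = flatten_mdata(nested)
print(flat["fp_machine"])    # {'batch_type': 'slurm'}
print(flat["fp_resources"])  # {'numb_node': 1}
```

The sketch keeps the original nested sections alongside the flattened keys; whether the real function does the same is not stated here.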

dpgen.remote.group_jobs module

class dpgen.remote.group_jobs.PMap(path, fname='pmap.json')

Bases: object

Path map class for operating on the pmap.json file (read, write, delete).

delete()
dump(pmap, indent=4)
load()
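
A class with the same three operations on a JSON file can be sketched as below. The class name `SimplePMap`, the attribute name, and the handling of a missing file are assumptions made for illustration; the behavior of the actual PMap class may differ.

```python
import json
import os
import tempfile


class SimplePMap:
    """Hypothetical stand-in for a path map stored in a JSON file."""

    def __init__(self, path, fname="pmap.json"):
        self.f_path_map = os.path.join(path, fname)

    def load(self):
        # Assumption: a missing file is treated as an empty map.
        if not os.path.isfile(self.f_path_map):
            return {}
        with open(self.f_path_map) as fp:
            return json.load(fp)

    def dump(self, pmap, indent=4):
        # Write the whole map, overwriting any previous content.
        with open(self.f_path_map, "w") as fp:
            json.dump(pmap, fp, indent=indent)

    def delete(self):
        # Assumption: deleting a non-existent file is a no-op.
        if os.path.isfile(self.f_path_map):
            os.remove(self.f_path_map)


# Round trip in a temporary directory: write, read back, delete.
with tempfile.TemporaryDirectory() as workdir:
    pm = SimplePMap(workdir)
    pm.dump({"task.000": "remote/abc"})
    loaded = pm.load()
    pm.delete()
    after_delete = pm.load()

print(loaded)        # {'task.000': 'remote/abc'}
print(after_delete)  # {}
```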
dpgen.remote.group_jobs.aws_submit_jobs(machine, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
dpgen.remote.group_jobs.group_local_jobs(ssh_sess, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
dpgen.remote.group_jobs.group_slurm_jobs(ssh_sess, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, remote_job=<class 'dpgen.remote.RemoteJob.SlurmJob'>, forward_task_deference=True)
dpgen.remote.group_jobs.ucloud_submit_jobs(machine, resources, command, work_path, tasks, group_size, forward_common_files, forward_task_files, backward_task_files, forward_task_deference=True)
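
The functions above all take `tasks` and `group_size`. Assuming `group_size` is the number of tasks bundled into one submitted job (an interpretation not stated on this page), the grouping step could look like the following sketch. The helper `chunk_tasks` and the task names are hypothetical.

```python
def chunk_tasks(tasks, group_size):
    """Hypothetical helper: split tasks into consecutive groups.

    Each group holds at most group_size tasks; the last group may
    be shorter when len(tasks) is not a multiple of group_size.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [tasks[i:i + group_size] for i in range(0, len(tasks), group_size)]


tasks = ["task.000", "task.001", "task.002", "task.003", "task.004"]
groups = chunk_tasks(tasks, 2)
print(groups)
# [['task.000', 'task.001'], ['task.002', 'task.003'], ['task.004']]
```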