Module taking care of important package constants.

Module Contents



Class with info on how to run training (cluster, MPI and GPU config).

class RunOptions(…: str | None = None, init_frz_model: str | None = None, finetune: str | None = None, restart: str | None = None, log_path: str | None = None, log_level: int = 0, mpi_log: str = 'master')

gpus: Optional[List[int]]

list of GPUs if any are present else None

is_chief: bool

in distributed training it is True for the main MPI process; in serial training it is always True

world_size: int

total worker count

my_rank: int

index of the MPI task

nodename: str

name of the node

nodelist

the list of nodes of the current mpirun

my_device: str

device type: gpu or cpu

property is_chief

Whether my rank is 0.

gpus: List[int] | None
world_size: int
my_rank: int
nodename: str
nodelist: List[int]
my_device: str
_HVD: horovod.tensorflow | None
_log_handles_already_set: bool = False
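
The relationship between my_rank, world_size and is_chief can be illustrated with a small stand-in class. This is only a sketch: RunInfo and save_checkpoint are hypothetical names invented for the example and are not part of the package. It assumes nothing beyond what is documented above, namely that the chief is the process with rank 0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunInfo:
    """Hypothetical stand-in mirroring a few of the documented attributes."""

    my_rank: int = 0
    world_size: int = 1
    my_device: str = "cpu"
    gpus: Optional[List[int]] = None

    @property
    def is_chief(self) -> bool:
        # The chief is the process whose rank is 0.
        return self.my_rank == 0


def save_checkpoint(run: RunInfo, path: str) -> bool:
    """Pretend to write a checkpoint; only the chief does the work."""
    if not run.is_chief:
        return False
    # A real implementation would write model state to `path` here.
    return True
```

Gating side effects such as checkpoint writing or summary printing on is_chief is a common way to avoid every worker repeating the same work in a distributed run.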

Print build and current running cluster configuration summary.

_setup_logger(log_path: pathlib.Path | None, log_level: int, mpi_log: str | None)

Set up package loggers.

log_level: int

logging level

log_path: Optional[pathlib.Path]

path to log file. If None, logs will be sent only to the console. If the parent directory does not exist, it will be created automatically. By default None.

mpi_log: Optional[str], optional

MPI log type, with three options. master outputs logs to file and console only from rank==0. collect writes messages from all ranks to one file opened under rank==0, and to the console. workers opens one log file for each worker, designated by its rank; console behaviour is the same as for collect.
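
The three mpi_log modes can be summarised as a decision about which file, if any, a given rank writes to and whether it also logs to the console. The sketch below is not the package's implementation. resolve_log_target and build_logger are hypothetical helpers, the per-rank file suffix is an assumed naming scheme, treating mpi_log=None as plain serial logging is an assumption, and log_level is taken to be a standard logging level. A real collect mode needs an MPI-aware file handler so that all ranks can share one file; here the shared path is only reported.

```python
import logging
from pathlib import Path
from typing import Optional, Tuple


def resolve_log_target(
    rank: int, log_path: Optional[Path], mpi_log: Optional[str]
) -> Tuple[Optional[Path], bool]:
    """Return (file this rank writes to, or None; whether it logs to console)."""
    if mpi_log is None:
        # Assumption: no MPI mode means plain serial logging.
        return log_path, True
    if mpi_log == "master":
        # Only rank 0 produces any output.
        is_master = rank == 0
        return (log_path if is_master else None), is_master
    if mpi_log == "collect":
        # All ranks target the single shared file, and the console.
        return log_path, True
    if mpi_log == "workers":
        # One file per rank; the ".<rank>" suffix is an assumed naming scheme.
        if log_path is None:
            return None, True
        return log_path.with_name(f"{log_path.name}.{rank}"), True
    raise ValueError(f"unknown mpi_log mode: {mpi_log!r}")


def build_logger(
    name: str,
    rank: int,
    log_path: Optional[Path],
    log_level: int = logging.INFO,
    mpi_log: Optional[str] = "master",
) -> logging.Logger:
    """Attach console and file handlers according to the resolved target."""
    target, use_console = resolve_log_target(rank, log_path, mpi_log)
    logger = logging.getLogger(name)
    logger.setLevel(log_level)
    if use_console:
        logger.addHandler(logging.StreamHandler())
    if target is not None:
        # Create the parent directory if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        logger.addHandler(logging.FileHandler(target))
    return logger
```

Keeping the mode decision in a pure function separate from handler creation makes the rank-dependent behaviour easy to check without starting an MPI job.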

_init_distributed(HVD: RunOptions._init_distributed.HVD)

Initialize settings for distributed training.

HVD

Horovod object


Initialize settings for serial training.
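
The difference between the distributed and serial initialisation paths can be sketched as follows. This is an illustration under stated assumptions, not the package's code. init_run_fields, HorovodLike and FakeHorovod are hypothetical names, and the only Horovod calls assumed are rank() and size(), which horovod.tensorflow provides. The real methods may set further fields, such as the node list and device.

```python
from typing import Any, Dict, Optional, Protocol


class HorovodLike(Protocol):
    """Minimal interface assumed of the Horovod object."""

    def rank(self) -> int: ...

    def size(self) -> int: ...


def init_run_fields(hvd: Optional[HorovodLike] = None) -> Dict[str, Any]:
    """Return rank-related fields for a serial (hvd is None) or distributed run."""
    if hvd is None:
        # Serial training: a single worker, which is always the chief.
        return {"my_rank": 0, "world_size": 1, "is_chief": True}
    rank = hvd.rank()
    # Distributed training: the chief is the process with rank 0.
    return {"my_rank": rank, "world_size": hvd.size(), "is_chief": rank == 0}


class FakeHorovod:
    """Hypothetical stub so the sketch runs without Horovod installed."""

    def __init__(self, rank: int, size: int) -> None:
        self._rank = rank
        self._size = size

    def rank(self) -> int:
        return self._rank

    def size(self) -> int:
        return self._size
```

For example, init_run_fields(FakeHorovod(3, 4)) describes the fourth of four workers, which is not the chief, while init_run_fields() describes a serial run.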