deepmd.pt.utils.learning_rate

Classes#

`BaseLR`	Helper class that provides a standard way to create an ABC using inheritance.
`LearningRateCosine`	Cosine annealing learning rate schedule with optional warmup.
`LearningRateExp`	Exponential decay learning rate schedule with optional warmup.

Module Contents#

class deepmd.pt.utils.learning_rate.BaseLR(start_lr: float, num_steps: int, stop_lr: float | None = None, stop_lr_ratio: float | None = None, warmup_steps: int = 0, warmup_ratio: float | None = None, warmup_start_factor: float = 0.0, **kwargs: Any)[source]#

Bases: abc.ABC, deepmd.utils.plugin.PluginVariant, make_plugin_registry('lr')

Helper class that provides a standard way to create an ABC using inheritance.

warmup_start_lr#

_start_lr#

num_steps#

decay_num_steps#

property start_lr: float#

Get the starting learning rate.

Returns:

float: The starting learning rate.

abstractmethod _decay_value(step: int | deepmd.dpmodel.array_api.Array) → deepmd.dpmodel.array_api.Array[source]#

Get the decayed learning rate at the given step (after warmup).

This method should implement the actual decay logic (exp, cosine, etc.) without considering warmup.

Parameters:

step (int or Array): The step index relative to the end of warmup. For example, if warmup_steps=100 and total_step=150, this method will be called with step=50.

Returns:

Array: The decayed learning rate (absolute value, not factor).

value(step: int | deepmd.dpmodel.array_api.Array) → deepmd.dpmodel.array_api.Array | float[source]#

Get the learning rate at the given step, including warmup.

Parameters:

step (int or Array): The absolute step index from the start of training.

Returns:

Array or float: The learning rate at the given step.
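The split between value (absolute step, warmup included) and _decay_value (step relative to the end of warmup) can be sketched as below. This is a minimal illustration, not the library's code: the function lr_with_warmup is hypothetical, and both the linear warmup ramp and the relation warmup_start_lr = warmup_start_factor * start_lr are assumptions that the text above does not state.

```python
from typing import Callable


def lr_with_warmup(
    step: int,
    start_lr: float,
    warmup_steps: int,
    warmup_start_factor: float,
    decay_value: Callable[[int], float],
) -> float:
    """Hypothetical stand-in for value(); not the deepmd implementation."""
    # Assumption: the warmup starts at a fraction of start_lr.
    warmup_start_lr = warmup_start_factor * start_lr
    if step < warmup_steps:
        # Assumption: linear ramp from warmup_start_lr up to start_lr.
        frac = step / warmup_steps
        return warmup_start_lr + (start_lr - warmup_start_lr) * frac
    # Decay phase: the decay function sees the step relative to the end of warmup.
    return decay_value(step - warmup_steps)


# Record which relative step the decay function receives.
seen = []


def constant_decay(rel_step: int) -> float:
    seen.append(rel_step)
    return 1e-3


lr_mid_warmup = lr_with_warmup(50, 1e-3, 100, 0.0, constant_decay)
lr_after_warmup = lr_with_warmup(150, 1e-3, 100, 0.0, constant_decay)
```

With warmup_steps=100, the call at absolute step 150 forwards the relative step 50 to the decay function, matching the example given for _decay_value.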

class deepmd.pt.utils.learning_rate.LearningRateCosine(start_lr: float, num_steps: int, stop_lr: float | None = None, stop_lr_ratio: float | None = None, warmup_steps: int = 0, warmup_ratio: float | None = None, warmup_start_factor: float = 0.0, **kwargs: Any)[source]#

Bases: BaseLR

Cosine annealing learning rate schedule with optional warmup.

The decay phase (after warmup) follows the cosine annealing formula:

\[lr(t) = lr_{\text{stop}} + \frac{lr_0 - lr_{\text{stop}}}{2} \left(1 + \cos\left(\pi \frac{t}{T}\right)\right)\]

where:

- \(lr_0\) is start_lr (learning rate at the start of decay phase)
- \(lr_{\text{stop}}\) is stop_lr (minimum learning rate)
- \(t\) is the step index within the decay phase
- \(T = \text{num\_steps} - \text{warmup\_steps}\) is the total number of decay steps

Equivalently, using \(\alpha = lr_{\text{stop}} / lr_0\):

\[lr(t) = lr_0 \cdot \left[\alpha + \frac{1}{2}(1 - \alpha) \left(1 + \cos\left(\pi \frac{t}{T}\right)\right)\right]\]
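As a quick numerical check, the two forms of the cosine formula can be evaluated side by side. The sketch below is standalone Python, not the class itself; the function names and the example values (start_lr=1e-3, stop_lr=1e-5, T=1000) are chosen for illustration only.

```python
import math


def cosine_lr(t: int, total: int, start_lr: float, stop_lr: float) -> float:
    """First form: lr_stop + (lr_0 - lr_stop) / 2 * (1 + cos(pi * t / T))."""
    return stop_lr + 0.5 * (start_lr - stop_lr) * (1.0 + math.cos(math.pi * t / total))


def cosine_lr_alpha(t: int, total: int, start_lr: float, stop_lr: float) -> float:
    """Second form, written with alpha = lr_stop / lr_0."""
    alpha = stop_lr / start_lr
    return start_lr * (alpha + 0.5 * (1.0 - alpha) * (1.0 + math.cos(math.pi * t / total)))


# Illustrative values, not library defaults.
START_LR, STOP_LR, TOTAL = 1e-3, 1e-5, 1000

lr_first = cosine_lr(0, TOTAL, START_LR, STOP_LR)           # start of decay phase
lr_half = cosine_lr(TOTAL // 2, TOTAL, START_LR, STOP_LR)   # midpoint
lr_last = cosine_lr(TOTAL, TOTAL, START_LR, STOP_LR)        # end of decay phase
```

The schedule starts at start_lr when t=0, passes through the average of start_lr and stop_lr at t=T/2, and reaches stop_lr at t=T.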

lr_min_factor#

_decay_value(step: int | deepmd.dpmodel.array_api.Array) → deepmd.dpmodel.array_api.Array[source]#

Get the cosine-annealed learning rate at the given step.

Parameters:

step (int or Array): The step index relative to the end of warmup.

Returns:

Array: The annealed learning rate (absolute value).

class deepmd.pt.utils.learning_rate.LearningRateExp(start_lr: float, num_steps: int, stop_lr: float | None = None, stop_lr_ratio: float | None = None, decay_steps: int = 5000, decay_rate: float | None = None, warmup_steps: int = 0, warmup_ratio: float | None = None, warmup_start_factor: float = 0.0, smooth: bool = False, **kwargs: Any)[source]#

Bases: BaseLR

Exponential decay learning rate schedule with optional warmup.

The decay phase (after warmup) follows the exponential decay formula.

Stepped mode (smooth=False, default):

\[lr(t) = lr_0 \cdot r^{\lfloor t / s \rfloor}\]

The learning rate decays every decay_steps steps, creating a staircase pattern.

Smooth mode (smooth=True):

\[lr(t) = lr_0 \cdot r^{t / s}\]

The learning rate decays continuously at every step.

where:

- \(lr_0\) is start_lr (learning rate at the start of decay phase)
- \(r\) is the decay rate decay_rate
- \(t\) is the step index within the decay phase
- \(s\) is decay_steps (the decay period)

The decay rate is automatically computed from start_lr and stop_lr over the total decay steps unless explicitly provided:

\[r = \left(\frac{lr_{\text{stop}}}{lr_0}\right)^{\frac{s}{T}}\]

where \(T = \text{num\_steps} - \text{warmup\_steps}\) is the total number of decay steps, and \(lr_{\text{stop}}\) is stop_lr.

decay_steps = 5000#

min_lr#

smooth = False#

_decay_value(step: int | deepmd.dpmodel.array_api.Array) → deepmd.dpmodel.array_api.Array[source]#

Get the exponentially decayed learning rate at the given step.

Parameters:

step (int or Array): The step index relative to the end of warmup.

Returns:

Array: The decayed learning rate (absolute value).