deepmd.jax.descriptor.dpa1
==========================

.. py:module:: deepmd.jax.descriptor.dpa1


Classes
-------

.. autoapisummary::

   deepmd.jax.descriptor.dpa1.GatedAttentionLayer
   deepmd.jax.descriptor.dpa1.NeighborGatedAttentionLayer
   deepmd.jax.descriptor.dpa1.NeighborGatedAttention
   deepmd.jax.descriptor.dpa1.DescrptBlockSeAtten
   deepmd.jax.descriptor.dpa1.DescrptDPA1


Module Contents
---------------

.. py:class:: GatedAttentionLayer(nnei: int, embed_dim: int, hidden_dim: int, num_heads: int = 1, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: Optional[float] = None, bias: bool = True, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: Optional[Union[int, list[int]]] = None, trainable: bool = True)

   Bases: :py:obj:`deepmd.dpmodel.descriptor.dpa1.GatedAttentionLayer`


   The unit operation of a native model.


   ..
       !! processed by numpydoc !!

   .. py:method:: __setattr__(name: str, value: Any) -> None


.. py:class:: NeighborGatedAttentionLayer(nnei: int, embed_dim: int, hidden_dim: int, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: Optional[float] = None, trainable_ln: bool = True, ln_eps: float = 1e-05, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: Optional[Union[int, list[int]]] = None, trainable: bool = True)

   Bases: :py:obj:`deepmd.dpmodel.descriptor.dpa1.NeighborGatedAttentionLayer`


   The unit operation of a native model.


   ..
       !! processed by numpydoc !!

   .. py:method:: __setattr__(name: str, value: Any) -> None


.. py:class:: NeighborGatedAttention(layer_num: int, nnei: int, embed_dim: int, hidden_dim: int, dotr: bool = False, do_mask: bool = False, scaling_factor: float = 1.0, normalize: bool = True, temperature: Optional[float] = None, trainable_ln: bool = True, ln_eps: float = 1e-05, smooth: bool = True, precision: str = DEFAULT_PRECISION, seed: Optional[Union[int, list[int]]] = None, trainable: bool = True)

   Bases: :py:obj:`deepmd.dpmodel.descriptor.dpa1.NeighborGatedAttention`


   The unit operation of a native model.


   ..
       !! processed by numpydoc !!

   .. py:method:: __setattr__(name: str, value: Any) -> None


.. py:class:: DescrptBlockSeAtten(rcut: float, rcut_smth: float, sel: Union[list[int], int], ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, type_one_side: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: list[tuple[int, int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, scaling_factor: float = 1.0, normalize: bool = True, temperature: Optional[float] = None, trainable_ln: bool = True, ln_eps: Optional[float] = 1e-05, smooth: bool = True, seed: Optional[Union[int, list[int]]] = None, trainable: bool = True)

   Bases: :py:obj:`deepmd.dpmodel.descriptor.dpa1.DescrptBlockSeAtten`


   The unit operation of a native model.


   ..
       !! processed by numpydoc !!

   .. py:method:: __setattr__(name: str, value: Any) -> None


.. py:class:: DescrptDPA1(rcut: float, rcut_smth: float, sel: Union[list[int], int], ntypes: int, neuron: list[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: list[tuple[int, int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, scaling_factor: float = 1.0, normalize: bool = True, temperature: Optional[float] = None, trainable_ln: bool = True, ln_eps: Optional[float] = 1e-05, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, spin: None = None, stripped_type_embedding: Optional[bool] = None, use_econf_tebd: bool = False, use_tebd_bias: bool = False, type_map: Optional[list[str]] = None, seed: Optional[Union[int, list[int]]] = None)

   Bases: :py:obj:`deepmd.dpmodel.descriptor.dpa1.DescrptDPA1`


   Attention-based descriptor proposed in the pretrainable DPA-1 model [R1f1e4d9beaef-1]_.

   This descriptor, :math:`\mathcal{D}^i \in \mathbb{R}^{M \times M_{<}}`, is given by

   .. math::
       \mathcal{D}^i = \frac{1}{N_c^2}(\hat{\mathcal{G}}^i)^T \mathcal{R}^i (\mathcal{R}^i)^T \hat{\mathcal{G}}^i_<,

   where :math:`\hat{\mathcal{G}}^i` represents the embedding matrix :math:`\mathcal{G}^i`
   after an additional self-attention mechanism, and :math:`\mathcal{R}^i` is the environment matrix defined as in the full case of the se_e2_a descriptor.
   Note that we obtain :math:`\mathcal{G}^i` using the type embedding method by default in this descriptor.
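
   The contraction above can be sketched in NumPy for a single atom. This is a minimal illustration, not the library's implementation: the function name ``contract_descriptor``, the array names, the shapes and the random inputs are all assumptions, and :math:`\mathcal{R}^i` is taken to have four columns as in the full se_e2_a environment matrix.

   .. code-block:: python

      import numpy as np


      def contract_descriptor(g_hat, r, axis_neuron):
          """Return D = (1 / N_c^2) * G^T R R^T G_< for one atom (illustrative only)."""
          n_c = g_hat.shape[0]                # N_c: number of neighbor slots
          g_sub = g_hat[:, :axis_neuron]      # G_<: first M_< columns of G
          gr = g_hat.T @ r                    # G^T R, shape (M, 4)
          gr_sub = g_sub.T @ r                # G_<^T R, shape (M_<, 4)
          return gr @ gr_sub.T / n_c**2       # shape (M, M_<)


      # Hypothetical sizes and random inputs, chosen only for the demonstration.
      rng = np.random.default_rng(0)
      n_c, m, m_sub = 6, 5, 2
      g_hat = rng.normal(size=(n_c, m))       # embedding matrix after attention
      r = rng.normal(size=(n_c, 4))           # environment matrix
      d = contract_descriptor(g_hat, r, m_sub)

   Here ``axis_neuron`` plays the role of :math:`M_<`, so ``d`` has shape ``(M, M_<)``.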

   To perform the self-attention mechanism, the queries :math:`\mathcal{Q}^{i,l} \in \mathbb{R}^{N_c\times d_k}`,
   keys :math:`\mathcal{K}^{i,l} \in \mathbb{R}^{N_c\times d_k}`,
   and values :math:`\mathcal{V}^{i,l} \in \mathbb{R}^{N_c\times d_v}` are first obtained:

   .. math::
       \left(\mathcal{Q}^{i,l}\right)_{j}=Q_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),

   .. math::
       \left(\mathcal{K}^{i,l}\right)_{j}=K_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),

   .. math::
       \left(\mathcal{V}^{i,l}\right)_{j}=V_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),

   where :math:`Q_{l}`, :math:`K_{l}`, :math:`V_{l}` represent three trainable linear transformations
   that output the queries and keys of dimension :math:`d_k` and values of dimension :math:`d_v`, and :math:`l`
   is the index of the attention layer.
   The input embedding matrix to the attention layers,  denoted by :math:`\mathcal{G}^{i,0}`,
   is chosen as the two-body embedding matrix.
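
   As a rough sketch of the three projections, assuming each linear transformation is a weight matrix plus a bias applied row by row: the weight names, the sizes and the random values below are hypothetical and do not come from the library.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      n_c, m, d_k, d_v = 6, 8, 4, 8           # hypothetical sizes

      g_prev = rng.normal(size=(n_c, m))      # G^{i,l-1}: one row per neighbor j

      # Hypothetical parameters of Q_l, K_l and V_l (trainable in a real model).
      w_q, b_q = rng.normal(size=(m, d_k)), np.zeros(d_k)
      w_k, b_k = rng.normal(size=(m, d_k)), np.zeros(d_k)
      w_v, b_v = rng.normal(size=(m, d_v)), np.zeros(d_v)


      def linear(x, w, b):
          """Apply the same affine map to every row of x."""
          return x @ w + b


      q = linear(g_prev, w_q, b_q)            # queries, shape (N_c, d_k)
      k = linear(g_prev, w_k, b_k)            # keys,    shape (N_c, d_k)
      v = linear(g_prev, w_v, b_v)            # values,  shape (N_c, d_v)

   Each row of ``q``, ``k`` and ``v`` depends only on the matching row of ``g_prev``, which mirrors the per-neighbor index :math:`j` in the equations.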

   Then the scaled dot-product attention method is adopted:

   .. math::
       A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})=\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right)\mathcal{V}^{i,l},

   where :math:`\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) \in \mathbb{R}^{N_c\times N_c}` are the attention weights.
   In the original attention method,
   one typically has :math:`\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}\right)=\mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right)`,
   with :math:`\sqrt{d_{k}}` being the normalization temperature.
   This is slightly modified to incorporate the angular information:

   .. math::
       \varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) = \mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right) \odot \hat{\mathcal{R}}^{i}(\hat{\mathcal{R}}^{i})^{T},

   where :math:`\hat{\mathcal{R}}^{i} \in \mathbb{R}^{N_c\times 3}` denotes normalized relative coordinates,
   :math:`\hat{\mathcal{R}}^{i}_{j} = \frac{\boldsymbol{r}_{ij}}{\lVert \boldsymbol{r}_{ij} \rVert}`,
   and :math:`\odot` means element-wise multiplication.
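
   The gated attention weights can be sketched as follows. This is a simplified NumPy illustration under stated assumptions: every neighbor slot is occupied (no padding), the ``smooth`` and ``normalize`` options are ignored, and the function names and random inputs are hypothetical.

   .. code-block:: python

      import numpy as np


      def softmax(x, axis=-1):
          """Numerically stable softmax along the given axis."""
          x = x - x.max(axis=axis, keepdims=True)
          e = np.exp(x)
          return e / e.sum(axis=axis, keepdims=True)


      def gated_attention_weights(q, k, r_ij):
          """softmax(Q K^T / sqrt(d_k)) multiplied element-wise by R_hat R_hat^T."""
          d_k = q.shape[-1]
          scores = q @ k.T / np.sqrt(d_k)                             # (N_c, N_c)
          r_hat = r_ij / np.linalg.norm(r_ij, axis=-1, keepdims=True)  # unit vectors
          gate = r_hat @ r_hat.T                                      # pairwise cosines
          return softmax(scores, axis=-1) * gate


      # Hypothetical inputs for a single atom with six neighbors.
      rng = np.random.default_rng(0)
      q_demo = rng.normal(size=(6, 4))
      k_demo = rng.normal(size=(6, 4))
      r_demo = rng.normal(size=(6, 3))        # relative coordinates r_ij
      phi = gated_attention_weights(q_demo, k_demo, r_demo)

   Because the gate contains only dot products of unit vectors, rotating all relative coordinates together leaves ``phi`` unchanged.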

   Then layer normalization is added in a residual way to finally obtain the self-attention local embedding matrix
   :math:`\hat{\mathcal{G}}^{i} = \mathcal{G}^{i,L_a}` after :math:`L_a` attention layers [R1f1e4d9beaef-1]_:

   .. math::
       \mathcal{G}^{i,l} = \mathcal{G}^{i,l-1} + \mathrm{LayerNorm}(A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})).
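
   The residual update can be sketched as a loop over attention layers. The code below is an assumption-laden illustration: ``layer_norm`` omits the trainable scale and shift, ``attention_stack`` is a hypothetical helper, and each attention layer is represented by an arbitrary callable rather than the real gated attention.

   .. code-block:: python

      import numpy as np


      def layer_norm(x, eps=1e-5):
          """Normalize each row to zero mean and unit variance (no trainable weights)."""
          mean = x.mean(axis=-1, keepdims=True)
          var = x.var(axis=-1, keepdims=True)
          return (x - mean) / np.sqrt(var + eps)


      def attention_stack(g0, attention_fns):
          """Apply G^{l} = G^{l-1} + LayerNorm(A^{l}(G^{l-1})) for each layer in turn."""
          g = g0
          for attn in attention_fns:
              g = g + layer_norm(attn(g))
          return g


      # Two stand-in "attention" layers, used only to exercise the update rule.
      rng = np.random.default_rng(0)
      g0 = rng.normal(size=(6, 8))
      mix = rng.normal(size=(8, 8))
      layers = [lambda g: g @ mix, lambda g: np.tanh(g)]
      g_final = attention_stack(g0, layers)   # plays the role of G^{i,L_a}

   The output keeps the shape of the input, so the same update can be repeated for any number of layers.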

   :Parameters:

       **rcut** : :class:`python:float`
           The cut-off radius :math:`r_c`

       **rcut_smth** : :class:`python:float`
           From where the environment matrix should be smoothed :math:`r_s`

       **sel** : :class:`python:list`\[:class:`python:int`], :class:`python:int`
           - ``list[int]``: ``sel[i]`` specifies the maximum number of type ``i`` atoms within the cut-off radius.
           - ``int``: the total maximum number of atoms within the cut-off radius.

       **ntypes** : :class:`python:int`
           Number of element types

       **neuron** : :class:`python:list`\[:class:`python:int`]
           Number of neurons in each hidden layer of the embedding net :math:`\mathcal{N}`

       **axis_neuron** : :class:`python:int`
           Number of axis neurons :math:`M_2` (number of columns of the sub-matrix of the embedding matrix)

       **tebd_dim** : :class:`python:int`
           Dimension of the type embedding

       **tebd_input_mode** : :class:`python:str`
           The input mode of the type embedding. Supported modes are ["concat", "strip"].

           - "concat": Concatenate the type embedding with the smoothed radial information as the joint input for the embedding network.
           - "strip": Use a separate embedding network for the type embedding and combine its output with the radial embedding network output.

       **resnet_dt** : :ref:`bool <python:bltin-boolean-values>`
           Whether to use a time-step `dt` in the resnet construction:
           :math:`y = x + dt \cdot \phi(Wx + b)`

       **trainable** : :ref:`bool <python:bltin-boolean-values>`
           Whether the weights of this descriptor are trainable.

       **trainable_ln** : :ref:`bool <python:bltin-boolean-values>`
           Whether to use trainable shift and scale weights in layer normalization.

       **ln_eps** : :class:`python:float`, :obj:`Optional <typing.Optional>`
           The epsilon value for layer normalization.

       **type_one_side** : :ref:`bool <python:bltin-boolean-values>`
           If 'False', type embeddings of both neighbor and central atoms are considered.
           If 'True', only type embeddings of neighbor atoms are considered.
           Default is 'False'.

       **attn** : :class:`python:int`
           Hidden dimension of the attention vectors

       **attn_layer** : :class:`python:int`
           Number of attention layers

       **attn_dotr** : :ref:`bool <python:bltin-boolean-values>`
           Whether to multiply the attention weights by the angular gate.

       **attn_mask** : :ref:`bool <python:bltin-boolean-values>`
           Whether to mask the diagonal of the attention weights.
           Only 'False' is supported, for consistency with the other backends;
           the 'True' option is not implemented in this version.

       **exclude_types** : :class:`python:list`\[:class:`python:list`\[:class:`python:int`]]
           The excluded pairs of types which have no interaction with each other.
           For example, `[[0, 1]]` means no interaction between type 0 and type 1.

       **env_protection** : :class:`python:float`
           Protection parameter to prevent division by zero errors during environment matrix calculations.

       **set_davg_zero** : :ref:`bool <python:bltin-boolean-values>`
           Set the shift of embedding net input to zero.

       **activation_function** : :class:`python:str`
           The activation function in the embedding net. Supported options are |ACTIVATION_FN|

       **precision** : :class:`python:str`
           The precision of the embedding net parameters. Supported options are |PRECISION|

       **scaling_factor** : :class:`python:float`
           The scaling factor of normalization in calculations of attention weights.
           If `temperature` is None, the scaling of attention weights is ``(N_dim * scaling_factor)**0.5``.

       **normalize** : :ref:`bool <python:bltin-boolean-values>`
           Whether to normalize the hidden vectors in attention weights calculation.

       **temperature** : :class:`python:float`, :obj:`Optional <typing.Optional>`
           If not None, the scaling of attention weights is `temperature` itself.

       **smooth_type_embedding** : :ref:`bool <python:bltin-boolean-values>`
           Whether to apply smoothing in the attention weights calculation.

       **concat_output_tebd** : :ref:`bool <python:bltin-boolean-values>`
           Whether to concatenate the type embedding to the output of the descriptor.

       **stripped_type_embedding** : :ref:`bool <python:bltin-boolean-values>`, :obj:`Optional <typing.Optional>`
           (Deprecated, kept only for compatibility.)
           Whether to strip the type embedding into a separate embedding network.
           Setting this parameter to `True` is equivalent to setting `tebd_input_mode` to 'strip'.
           Setting it to `False` is equivalent to setting `tebd_input_mode` to 'concat'.
           The default value is `None`, which means the `tebd_input_mode` setting will be used instead.

       **use_econf_tebd** : :ref:`bool <python:bltin-boolean-values>`, :obj:`Optional <typing.Optional>`
           Whether to use electronic configuration type embedding.

       **use_tebd_bias** : :ref:`bool <python:bltin-boolean-values>`, :obj:`Optional <typing.Optional>`
           Whether to use bias in the type embedding layer.

       **type_map** : :class:`python:list`\[:class:`python:str`], :obj:`Optional <typing.Optional>`
           A list of strings giving the name of each atom type.

       **spin**
           The old implementation of deepspin.
           Only `None` is supported, for consistency with the other backends;
           any other value is not implemented in this version.
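
   Two of the option rules stated in the parameter list can be written out as small pure-Python helpers. Both function names are hypothetical and are not part of the ``deepmd`` API. ``n_dim`` is passed in as an argument because the parameter list does not say which dimension :math:`N_{dim}` denotes.

   .. code-block:: python

      from typing import Optional


      def attention_scaling(
          n_dim: int,
          scaling_factor: float = 1.0,
          temperature: Optional[float] = None,
      ) -> float:
          """Scaling of the attention weights as described for `scaling_factor` and `temperature`."""
          if temperature is not None:
              return temperature                   # an explicit temperature wins
          return (n_dim * scaling_factor) ** 0.5   # otherwise (N_dim * scaling_factor)**0.5


      def resolve_tebd_input_mode(
          tebd_input_mode: str = "concat",
          stripped_type_embedding: Optional[bool] = None,
      ) -> str:
          """Fold the deprecated `stripped_type_embedding` flag into `tebd_input_mode`."""
          if stripped_type_embedding is not None:
              # True maps to "strip", False maps to "concat".
              tebd_input_mode = "strip" if stripped_type_embedding else "concat"
          if tebd_input_mode not in ("concat", "strip"):
              raise ValueError(f"unsupported tebd_input_mode: {tebd_input_mode!r}")
          return tebd_input_mode

   For instance, ``attention_scaling(128, scaling_factor=2.0)`` evaluates to ``16.0``, and ``resolve_tebd_input_mode("concat", stripped_type_embedding=True)`` returns ``"strip"``.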


   .. rubric:: References

   .. [R1f1e4d9beaef-1] Duo Zhang, Hangrui Bi, Fu-Zhi Dai, Wanrun Jiang, Linfeng Zhang, and Han Wang. 2022.
      DPA-1: Pretraining of Attention-based Deep Potential Model for Molecular Simulation.
      arXiv preprint arXiv:2208.08236.

   .. only:: latex

      [R1f1e4d9beaef-1]_


   ..
       !! processed by numpydoc !!

   .. py:method:: __setattr__(name: str, value: Any) -> None