Development Notes
=================

Module layout
-------------

The model wrappers keep DeePMD-facing APIs in the top-level model modules and
move reusable glue code into small helpers:

``deepmd_gnn.mace``
    The DeePMD ``BaseModel`` implementation for MACE. It owns parameter
    serialization, statistics, lower-interface forwarding, force/virial
    gradients, export hooks, and MPI feature communication.

``deepmd_gnn.mace_network``
    Construction and conversion of the underlying MACE ``ScaleShiftMACE``
    module. Keep MACE package imports, cuEquivariance configuration, scripted
    model creation, and cueq-to-e3nn weight transfer here instead of mixing them
    into the DeePMD wrapper.

``deepmd_gnn.export``
    Shared helpers for PyTorch exportable backend traces. This includes neighbor
    list padding for symbolic shapes and one-shot cleanup of guards generated by
    ``torch.export``.
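
    As a minimal sketch of the padding idea (not the helper shipped in
    ``deepmd_gnn.export``), the hypothetical ``pad_nlist`` below widens the
    neighbor axis to a requested width using ``-1``, the value DeePMD neighbor
    lists use for empty slots. The function name and the fixed-width policy are
    assumptions made for illustration.

    .. code-block:: python

        import torch


        def pad_nlist(nlist: torch.Tensor, width: int) -> torch.Tensor:
            """Pad the last axis of an (nf, nloc, nnei) neighbor list with -1.

            Hypothetical helper: the list is returned unchanged when it is
            already at least `width` entries wide.
            """
            nf, nloc, nnei = nlist.shape
            if nnei >= width:
                return nlist
            # new_full keeps the dtype and device of the input tensor.
            pad = nlist.new_full((nf, nloc, width - nnei), -1)
            return torch.cat([nlist, pad], dim=-1)


        # One frame, two local atoms, two neighbor slots each.
        nlist = torch.tensor([[[1, 2], [0, -1]]])
        padded = pad_nlist(nlist, 4)

    Padded entries can then be masked out with ``padded >= 0``, so later code
    does not need to branch on the actual neighbor count.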

``deepmd_gnn.deepmd_ops``
    Compatibility hooks for DeePMD custom PyTorch ops. The placeholder
    ``border_op`` exists only so TorchScript can compile wrappers before the
    real DeePMD extension is loaded.

``deepmd_gnn.stat_compat``
    Version-tolerant access to DeePMD observed-type statistic helpers used by
    both MACE and NequIP wrappers.
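
    One common way to get version-tolerant access is to try several candidate
    names and use the first one that resolves. The sketch below shows that
    pattern with stand-in objects. ``resolve_helper`` and the helper names
    ``collect_observed_types`` and ``collect_types`` are hypothetical and are
    not taken from DeePMD or from ``deepmd_gnn.stat_compat``.

    .. code-block:: python

        from types import SimpleNamespace
        from typing import Any, Callable, Sequence


        def resolve_helper(namespace: Any, candidates: Sequence[str]) -> Callable[..., Any]:
            """Return the first callable attribute found under a candidate name."""
            for name in candidates:
                helper = getattr(namespace, name, None)
                if callable(helper):
                    return helper
            raise AttributeError(f"none of {list(candidates)} found on {namespace!r}")


        # Stand-ins for two library versions that expose the same helper under
        # different (hypothetical) names.
        old_api = SimpleNamespace(collect_types=lambda data: sorted(set(data)))
        new_api = SimpleNamespace(collect_observed_types=lambda data: sorted(set(data)))

        names = ("collect_observed_types", "collect_types")
        results = [resolve_helper(api, names)(["O", "H", "H"]) for api in (old_api, new_api)]

    Keeping the lookup in one place means the model wrappers call a single
    function and never test for library versions themselves.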

MACE forward path
-----------------

``MaceModel.forward_lower_common`` is intentionally kept in ``mace.py`` because
it is the main adapter between DeePMD tensors and MACE graph tensors. When
changing it, keep these stages separate:

1. Normalize DeePMD lower-interface tensors and build an edge index.
2. Convert extended ghost atoms back to mapped local atoms when no MPI
   communication dictionary is available.
3. Build the MACE input graph and optional displacement tensor used for virial
   gradients.
4. Select the energy path: eager MACE, exportable module-by-module execution, or
   layer-wise MPI communication.
5. Differentiate the reduced energy to produce DeePMD-compatible force and
   virial outputs.
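
Stages 1, 3, and 5 can be illustrated with a toy sketch. It is not the code in
``forward_lower_common``: it handles a single frame without ghost atoms,
replaces MACE with a harmonic pair energy, and assumes the sign convention
``virial = -dE/d(strain)``. All function names are hypothetical.

.. code-block:: python

    import torch


    def nlist_to_edge_index(nlist: torch.Tensor) -> torch.Tensor:
        """Stage 1: convert an (nloc, nnei) neighbor list to a (2, nedge) edge index."""
        nloc, nnei = nlist.shape
        centers = torch.arange(nloc).unsqueeze(1).expand(nloc, nnei)
        mask = nlist >= 0  # -1 marks an empty neighbor slot
        return torch.stack([nlist[mask], centers[mask]])  # (sender, receiver)


    def toy_energy(coord: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        """Stand-in for the model: 0.5 * |r|^2 per pair, halved for double counting."""
        sender, receiver = edge_index
        vec = coord[sender] - coord[receiver]
        return 0.25 * (vec**2).sum()


    def energy_force_virial(coord: torch.Tensor, nlist: torch.Tensor):
        coord = coord.detach().clone().requires_grad_(True)
        # Stage 3: a symmetric strain applied to the positions lets autograd
        # return the strain derivative of the energy.
        displacement = torch.zeros(3, 3, dtype=coord.dtype, requires_grad=True)
        strain = 0.5 * (displacement + displacement.T)
        strained = coord + coord @ strain
        energy = toy_energy(strained, nlist_to_edge_index(nlist))
        # Stage 5: differentiate the reduced (scalar) energy once.
        grad_coord, grad_disp = torch.autograd.grad(energy, [coord, displacement])
        return energy.detach(), -grad_coord, -grad_disp


    coord = torch.tensor([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    nlist = torch.tensor([[1, -1], [0, -1]])
    energy, force, virial = energy_force_virial(coord, nlist)

For this stretched two-atom spring the forces are equal and opposite along
``x``, and only the ``xx`` component of the virial is nonzero.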

Avoid mixing export-only logic into the normal eager path unless the same
behavior is required by training, inference, and freezing. The export path has
different constraints because it must be traceable with symbolic atom counts.
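
One way to keep the paths apart is to choose the path once, up front, and
leave each branch free of checks that belong to another. The sketch below is
an assumption about how such a dispatch could look. ``EnergyPath``,
``select_energy_path``, the ``exporting`` flag, and the precedence of MPI over
export are all hypothetical.

.. code-block:: python

    from enum import Enum
    from typing import Mapping, Optional


    class EnergyPath(Enum):
        EAGER = "eager"
        EXPORT = "export"
        MPI = "mpi"


    def select_energy_path(
        exporting: bool,
        comm_dict: Optional[Mapping[str, object]],
    ) -> EnergyPath:
        """Pick exactly one energy path for a forward call."""
        if comm_dict is not None:
            # Assumed precedence: a communication dictionary implies MPI.
            return EnergyPath.MPI
        if exporting:
            return EnergyPath.EXPORT
        return EnergyPath.EAGER


    path = select_energy_path(exporting=False, comm_dict=None)

With a single decision point, export-specific handling stays inside the
``EXPORT`` branch and cannot change eager behavior by accident.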
