Code

Our code is here.

TL;DR

  • We built a transformer variant whose residual stream is constrained to lie on the standard probability simplex, motivated by the interpretability advantages of having activations with a privileged basis and a natural probabilistic interpretation.
  • Computation is performed using the centered log-ratio (CLR) transform, which maps the simplex to the zero-sum subspace of Euclidean space. Attention and feed-forward layers operate in CLR space, and residual updates use Aitchison addition (log-space addition followed by renormalization), ensuring the state always remains a valid distribution.
  • This work extends our SPAR project on geometric constraints for interpretability. The linked repo is an isolated component of a larger private codebase and showcases the implementation.
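
To make the CLR and Aitchison-addition bullet concrete, here is a minimal NumPy sketch of the transform, its inverse, and one residual update. It is not taken from the linked repo: the function names (`clr`, `clr_inverse`, `aitchison_add`, `residual_update`), the `eps` clamp, and the choice to map the layer output back to the simplex with a softmax before adding it are all assumptions made for illustration.

```python
import numpy as np

def clr(p, eps=1e-12):
    """Centered log-ratio: log p minus its mean, so the result sums to zero."""
    log_p = np.log(p + eps)  # eps is an assumed guard against log(0)
    return log_p - log_p.mean(axis=-1, keepdims=True)

def clr_inverse(z):
    """Map CLR coordinates back to the simplex (this is a softmax)."""
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def aitchison_add(p, q, eps=1e-12):
    """Add in log space, then renormalize; equivalent to normalizing p * q."""
    return clr_inverse(np.log(p + eps) + np.log(q + eps))

def residual_update(p, layer):
    """One hypothetical residual step: the layer sees CLR coordinates,
    its output is mapped to the simplex, then Aitchison-added to the state."""
    z = clr(p)
    update = clr_inverse(layer(z))  # assumed: layer output treated as log-ratios
    return aitchison_add(p, update)
```

For example, `residual_update(np.array([0.2, 0.3, 0.5]), lambda z: 0.5 * z)` returns a vector that is still positive and sums to one, which is the invariant the bullet describes. Attention and feed-forward blocks would stand in for the `layer` callable here.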