OpenMOSS Interpretability Research
We aim to break neural networks down into basic features and understand what those features represent, how they are connected, and how these structures form. Ultimately, we pursue the beauty in their internal structure.
Bridging the Attention Gap: Complete Replacement Models for Complete Circuit Tracing
With both attention and MLP replacement layers, we fully sparsified a language model, exposing its underlying local and global circuits.
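The core idea behind a replacement layer can be sketched in a few lines. The toy code below is illustrative only (it is not the OpenMOSS implementation, and all sizes, names, and the top-k sparsification rule are assumptions): a dense hidden layer is replaced by a much wider dictionary of features that activate sparsely, so each active feature can be traced individually when reconstructing circuits.

```python
# Illustrative sketch of a sparse replacement layer (hypothetical, not the
# OpenMOSS code): encode into a wide feature dictionary, keep only the
# top-k activations, then decode back to the residual stream width.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_dict, k = 8, 64, 4   # assumed toy sizes; k = features kept active

W_enc = rng.normal(size=(d_model, d_dict)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_dict, d_model)) / np.sqrt(d_dict)

def replacement_layer(x):
    """Map x to sparse feature activations, then reconstruct the output."""
    pre = np.maximum(x @ W_enc, 0.0)      # ReLU feature activations
    drop = np.argsort(pre)[:-k]           # indices of all but the k largest
    feats = pre.copy()
    feats[drop] = 0.0                     # zero everything outside the top-k
    return feats @ W_dec, feats

x = rng.normal(size=d_model)
out, feats = replacement_layer(x)
```

Because at most `k` of the 64 dictionary features are nonzero for any input, the layer's output decomposes into a small sum of feature contributions (`feats[i] * W_dec[i]`), which is what makes attributing the output to individual features tractable.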