Explore our work in Mechanistic Interpretability and Efficient Nano-LLMs.
Current AI development is hindered by an "observability crisis"; models are treated as opaque engines judged only by external output. We are building the sensors required to see inside.
The VSM Protocol
The Vector-Space-Mapping (VSM) Protocol is a diagnostic framework designed to function as a "Mechanistic Stethoscope" for Transformer models.
By treating the Transformer not just as a computational graph but as a physical system or "Resonant Cavity", we can insert sensors that read the internal state of the attention mechanisms in real time. This allows us to move beyond simple loss metrics and observe the actual structural organization of the model's "mind" as it learns.
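In practice, such sensors can be implemented as forward hooks that capture attention probabilities during a pass. Below is a minimal PyTorch sketch of the idea, assuming attention modules that return their probabilities as a second output; the name filter and tensor shapes are illustrative assumptions, not the actual VSM instrumentation.

```python
import torch

class AttentionSensor:
    """Records per-head attention statistics during a forward pass."""

    def __init__(self):
        self.readings = {}

    def attach(self, model):
        # Hook every module whose name marks it as an attention block.
        # The "attn" name filter is illustrative; adapt it to your model.
        for name, module in model.named_modules():
            if name.endswith("attn"):
                module.register_forward_hook(self._make_hook(name))

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # Assumes the module returns (hidden_states, attn_probs), with
            # attn_probs shaped [batch, heads, query, key], softmax-normalized.
            if isinstance(output, tuple) and len(output) > 1 and output[1] is not None:
                probs = output[1]
                # Per-query entropy of each head's attention distribution.
                entropy = -(probs * (probs + 1e-9).log()).sum(-1)
                # Store mean entropy per head: low = sharp, high = diffuse.
                self.readings[name] = entropy.mean(dim=(0, 2)).detach()
        return hook
```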
Focus & Specialization
Our research maps the internal state of a model onto a 2D state space defined by two fundamental qualities (see the sketch after this list):
- Coherence (Focus): A measure of confidence. Are the attention heads sharp and certain, or diffuse and confused?
- Agreement (Specialization): A measure of diversity. Are the heads redundant, doing the same work? Or have they formed a "Team of Specialists," each covering a unique feature subspace?
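Both axes can be estimated directly from a layer's attention tensor. The following is a minimal sketch under assumed conventions (softmax-normalized attention, cosine similarity for Agreement); the exact VSM normalization may differ.

```python
import math
import torch
import torch.nn.functional as F

def focus_and_agreement(attn, eps=1e-9):
    """Map one layer's attention tensor [batch, heads, query, key]
    onto the (Focus, Agreement) state space."""
    B, H, Q, K = attn.shape

    # Coherence (Focus): 1 minus normalized entropy.
    # Sharp, confident heads approach 1; uniform, confused heads approach 0.
    entropy = -(attn * (attn + eps).log()).sum(-1)            # [B, H, Q]
    focus = 1.0 - (entropy / math.log(K)).mean()

    # Agreement: mean pairwise cosine similarity between the heads'
    # flattened attention maps. High values mean redundant heads; a
    # "Team of Specialists" drives this toward zero.
    maps = F.normalize(attn.mean(0).reshape(H, -1), dim=-1)   # [H, Q*K]
    sim = maps @ maps.T                                       # [H, H]
    agreement = (sim - torch.eye(H, device=attn.device)).sum() / (H * (H - 1))

    return focus.item(), agreement.item()
```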
We have shown quantitatively that untrained models begin in a state of "Synchronized Confusion" (High Agreement, Low Focus) and must break this symmetry to learn.
Project Janus
Project Janus is an engineering initiative to maximize the "Expressive Bandwidth" of small-parameter models (Nano-LLMs).
Small models often suffer from "Attentional Collapse," where limited capacity leads to redundant feature learning. Janus aims to solve this by enforcing structural efficiency, allowing a 40M-parameter model to punch far above its weight class.
"Janus achieves parity Loss with 28% less Redundancy."
Vector Space Homeostasis
We combat redundancy through a regularization technique called Vector Space Homeostasis.
By applying a "Diversity Pressure" during training, we actively penalize feature correlation between attention heads. This forces the model to maintain orthogonality in its vector space, ensuring that every parameter contributes a unique, non-redundant value to the computation.
Modernized Chassis
To support these advanced dynamics, we abandoned legacy architectures in favor of a modernized "Janus Chassis".
This architecture is engineered for stability and geometric preservation. It uses components such as Rotary Positional Embeddings (RoPE) to preserve vector geometry and RMSNorm to keep the signal space clean. This provides the stable environment our diversity pressure mechanisms need to operate effectively.
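As an illustration of the geometric point, RoPE encodes position as pure rotation of channel pairs, which leaves vector norms untouched. Below is a standard half-split (GPT-NeoX-style) RoPE sketch, not necessarily the exact Janus implementation.

```python
import torch

def apply_rope(x, base=10000.0):
    """Rotary positional embedding (half-split variant).

    Rotates channel pairs by a position-dependent angle, so position is
    encoded as pure rotation and vector norms are preserved.
    x: [batch, heads, seq, dim] with even dim.
    """
    *_, T, D = x.shape
    half = D // 2
    freqs = base ** (-torch.arange(half, dtype=x.dtype, device=x.device) / half)
    angles = torch.arange(T, dtype=x.dtype, device=x.device)[:, None] * freqs  # [T, half]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # A 2D rotation applied to each (x1_i, x2_i) channel pair.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```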
Guided Training Dynamics
We do not believe in static training. Our protocols utilize dynamic pressure schedules that guide the model through distinct phases of development.
By strategically applying and releasing diversity pressure, we allow the model to first form primitive circuits, then force it to "work out" and differentiate its features, before finally cooling down to refine its predictive capabilities. This results in a "Break from Symmetry" that is significantly faster and more complete than under standard training.
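A sketch of one such schedule is below; the three-phase shape follows the description above, but the phase boundaries and peak coefficient are illustrative assumptions.

```python
def pressure_schedule(step, total_steps, warmup=0.1, workout=0.6, peak=0.1):
    """Three-phase diversity-pressure schedule (boundaries are illustrative).

    Phase 1 (warm-up):   no pressure while primitive circuits form.
    Phase 2 (workout):   ramp pressure to `peak` to force differentiation.
    Phase 3 (cool-down): release pressure so the model refines prediction.
    """
    t = step / total_steps
    if t < warmup:                          # phase 1: learn freely
        return 0.0
    if t < warmup + workout:                # phase 2: apply pressure
        ramp = (t - warmup) / workout
        return peak * min(1.0, 2.0 * ramp)  # fast ramp, then hold at peak
    # phase 3: linear decay back to zero for pure predictive refinement
    return peak * max(0.0, (1.0 - t) / (1.0 - warmup - workout))
```

The returned coefficient would scale the diversity-pressure term at each optimization step.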