AI & ML Breaks Assumption

Routing signatures reveal that MoE experts are highly task-specific, allowing a simple linear classifier to identify task categories with 92.5% accuracy based only on routing patterns.

arXiv · March 13, 2026 · 2603.11114

Mynampati Sri Ranganadha Avinash

Why it matters

Challenges the assumption that sparse MoE routing is primarily a load-balancing or efficiency mechanism. It shows that experts specialize at a semantic level, suggesting that routing behavior can serve as a probe for model understanding or as a trigger for task-specific downstream interventions.

From the abstract

Sparse Mixture-of-Experts (MoE) architectures enable efficient scaling of large language models through conditional computation, yet the routing mechanisms responsible for expert selection remain poorly understood. In this work, we introduce routing signatures, a vector representation summarizing expert activation patterns across layers for a given prompt, and use them to study whether MoE routing exhibits task-conditioned structure. Using OLMoE-1B-7B-0125-Instruct as an empirical testbed, we […]
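To make the idea concrete, the pipeline the abstract describes can be sketched in a few lines: build a routing signature by counting which experts each layer routes tokens to, then fit a linear classifier on those vectors. This is a minimal illustration on synthetic, task-biased routing data, not the paper's code; the function names, the layer/expert counts, and the data-generation scheme are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def routing_signature(expert_choices, num_experts):
    """Summarize routing for one prompt as a vector.

    expert_choices: (num_layers, num_tokens) array of selected expert ids.
    Returns per-layer expert activation frequencies, concatenated.
    """
    sig = []
    for layer_choices in expert_choices:
        counts = np.bincount(layer_choices, minlength=num_experts)
        sig.append(counts / counts.sum())
    return np.concatenate(sig)

# Synthetic demo (assumed setup): two "tasks" whose prompts are routed
# toward disjoint subsets of experts, mimicking task-specific routing.
rng = np.random.default_rng(0)
num_layers, num_experts, num_tokens = 4, 8, 64

def sample_prompt_routing(task):
    # Each task strongly favors its own 4 experts out of 8.
    probs = np.full(num_experts, 0.02)
    probs[task * 4:(task + 1) * 4] = 0.23
    probs /= probs.sum()
    return rng.choice(num_experts, size=(num_layers, num_tokens), p=probs)

X = np.stack([routing_signature(sample_prompt_routing(t), num_experts)
              for t in (0, 1) for _ in range(50)])
y = np.array([t for t in (0, 1) for _ in range(50)])

# A simple linear classifier over signatures, as in the paper's probe.
clf = LogisticRegression(max_iter=1000).fit(X, y)
acc = clf.score(X, y)
```

On this deliberately separable toy data the probe is near-perfect; the paper's 92.5% figure refers to real routing patterns across task categories, which are far noisier.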