AI & ML · New Capability

Capability-Guided Compression uses Sparse Autoencoders (SAEs) to prevent 'capability loss' during model pruning and quantization.

arXiv · March 18, 2026 · 2603.16440

Rishaank Gupta

The Takeaway

Existing compression methods are steered by perplexity, a metric that often masks the loss of specific reasoning capabilities. By using 'capability density maps' derived from SAEs, practitioners can instead allocate compression budgets to protect functional components, avoiding the abrupt performance 'phase transitions' common in standard pruning.
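The digest doesn't spell out the paper's algorithm, but the budget-allocation idea can be sketched. Below is a minimal Python sketch, assuming per-layer SAE feature activations stand in for a 'capability density map'; the names `capability_density` and `allocate_sparsity` are hypothetical, not the paper's API. Capability-dense layers receive lower sparsity targets while the average target still meets the global budget.

```python
import numpy as np

def capability_density(sae_activations: np.ndarray) -> np.ndarray:
    """Illustrative proxy for a capability density map (not the paper's
    exact definition): the fraction of SAE features per layer that fire
    strongly on capability-probing prompts, normalized over layers."""
    # sae_activations: [n_layers, n_features] mean activation on probes.
    strong = (sae_activations > sae_activations.mean()).mean(axis=1)
    return strong / strong.sum()

def allocate_sparsity(density: np.ndarray, global_sparsity: float) -> np.ndarray:
    """Turn a global pruning budget into per-layer sparsity targets:
    layers with high capability density are pruned less aggressively."""
    weights = (1.0 - density) / (1.0 - density).sum()
    # Before clipping, the mean per-layer target equals global_sparsity.
    per_layer = global_sparsity * weights * len(density)
    return np.clip(per_layer, 0.0, 0.95)

# Toy usage: 4 layers, 8 SAE features each, prune 50% of weights overall.
rng = np.random.default_rng(0)
acts = rng.random((4, 8))
print(allocate_sparsity(capability_density(acts), global_sparsity=0.5))
```

Weighting sparsity inversely to density is just one plausible allocation rule; the point is that the budget becomes a function of what each layer encodes rather than a uniform constant.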

From the abstract

Large language model compression has made substantial progress through pruning, quantization, and low-rank decomposition, yet a fundamental limitation persists across all existing methods: compression budgets are allocated without any representation of what individual model components functionally encode. We term this the capability-blind compression problem and argue it is a root cause of two well-documented failures -- the insensitivity of perplexity-based evaluation to reasoning capability loss […]
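To make the perplexity-insensitivity failure concrete: a compressed model can be scored on both perplexity and narrow capability probes, and the two can disagree. Here is a minimal sketch using Hugging Face transformers, assuming `gpt2` as a stand-in for the model under test; the probe pairs are illustrative, not from the paper.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in; substitute the compressed model under test
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean next-token NLL
    return math.exp(loss.item())

@torch.no_grad()
def capability_accuracy(pairs: list[tuple[str, str]]) -> float:
    """Exact-match accuracy on prompt/answer pairs probing one capability."""
    hits = 0
    for prompt, answer in pairs:
        ids = tok(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=8, do_sample=False,
                             pad_token_id=tok.eos_token_id)
        completion = tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)
        hits += answer.strip() in completion
    return hits / len(pairs)

# Report both: perplexity can stay flat while capability accuracy drops.
print(perplexity("The capital of France is Paris."))
print(capability_accuracy([("Q: 2 + 2 = ", "4"), ("Q: 7 * 6 = ", "42")]))
```

Tracking a capability-specific score alongside perplexity is what exposes the abrupt 'phase transitions' that a single aggregate metric averages away.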