AI & ML New Capability

Enables semantically precise model editing directly in the weight space without any training data.

March 27, 2026

Original Paper

From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition

Francesco Gentile, Nicola Dall'Asen, Francesco Tonini, Massimiliano Mancini, Lorenzo Vaquero, Elisa Ricci

arXiv · 2603.24653

The Takeaway

By decomposing CLIP's attention heads using SVD and interpreting them via sparse concept mapping, practitioners can now suppress or amplify specific visual concepts by modifying weights directly, bypassing the need for expensive fine-tuning.
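To make the mechanism concrete, here is a minimal sketch of SVD-based weight editing. It is not the authors' code: the matrix shapes, the chosen index, and the variable names are illustrative assumptions. The idea is to decompose a head's weight matrix, identify a singular direction associated with a concept (the paper does this via sparse concept mapping; here the index is simply picked by hand), and rescale its singular value to suppress or amplify that concept.

```python
import numpy as np

# Hypothetical sketch of data-free concept editing via SVD.
# Shapes, names, and the chosen index are assumptions, not the paper's API.
rng = np.random.default_rng(0)
d = 64
W = rng.standard_normal((d, d))  # stand-in for one attention head's weight matrix

# 1. Decompose the head's weights in weight space (no activations, no data).
U, S, Vt = np.linalg.svd(W, full_matrices=False)

# 2. Assume singular direction k has been mapped to the target concept.
k = 3

# 3. Suppress the concept by zeroing its singular value
#    (amplification would scale it up instead), then reconstruct.
S_edit = S.copy()
S_edit[k] = 0.0
W_edit = U @ np.diag(S_edit) @ Vt

# Sanity check: the edited weights no longer act along the removed direction.
residual = abs(U[:, k] @ W_edit @ Vt[k, :])
print(residual < 1e-8)  # True
```

Because the edit is a rank-one modification of existing weights, it needs no gradient steps, no dataset, and no fine-tuning run, which is the practical appeal highlighted above.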

From the abstract

As vision-language models are deployed at scale, understanding their internal mechanisms becomes increasingly critical. Existing interpretability methods predominantly rely on activations, making them dataset-dependent, vulnerable to data bias, and often restricted to coarse head-level explanations. We introduce SITH (Semantic Inspection of Transformer Heads), a fully data-free, training-free framework that directly analyzes CLIP's vision transformer in weight space. For each attention head, we […]