AI & ML Breaks Assumption

Masked Image Modeling (MIM) representations are polluted with non-semantic noise, and a zero-cost, post-hoc linear projection can remove it.

April 2, 2026

Original Paper

Suppressing Non-Semantic Noise in Masked Image Modeling Representations

Martine Hjelkrem-Tan, Marius Aasan, Rwiddhi Chakraborty, Gabriel Y. Arteaga, Changkyu Choi, Adín Ramírez Rivera

arXiv · 2604.00172

The Takeaway

The paper introduces SOAP (Semantically Orthogonal Artifact Projection), a training-free method that consistently improves zero-shot performance for MIM models such as MAE. This challenges the assumption that pre-training objectives alone produce clean semantic features, and it offers practitioners a 'free' performance boost.

From the abstract

Masked Image Modeling (MIM) has become a ubiquitous self-supervised vision paradigm. In this work, we show that MIM objectives cause the learned representations to retain non-semantic information, which ultimately hurts performance during inference. We introduce a model-agnostic score for semantic invariance using Principal Component Analysis (PCA) on real and synthetic non-semantic images. Based on this score, we propose a simple method, Semantically Orthogonal Artifact Projection (SOAP), to […]
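The recipe the abstract describes — run PCA on features extracted from non-semantic images, then project those directions out of every embedding — can be sketched in a few lines. This is a minimal illustration under assumed shapes and names (`fit_artifact_projection`, `noise_feats`, `k` are all hypothetical), not the authors' implementation:

```python
import numpy as np

def fit_artifact_projection(noise_feats: np.ndarray, k: int) -> np.ndarray:
    """Build a projector onto the orthogonal complement of the top-k
    principal directions of features from non-semantic images.

    noise_feats: (N, D) array of encoder features for non-semantic inputs
                 (e.g. noise or texture images); k: number of artifact
                 directions to suppress. Both are illustrative choices.
    """
    # Center the non-semantic features before PCA.
    X = noise_feats - noise_feats.mean(axis=0, keepdims=True)
    # Right singular vectors = principal directions of the noise features.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U = Vt[:k].T                          # (D, k) artifact basis
    # Projector that removes the span of U: P = I - U U^T.
    P = np.eye(X.shape[1]) - U @ U.T
    return P

# Usage sketch: clean features from real images with one matrix multiply.
# cleaned_feats = image_feats @ P     # no training, no extra forward pass
```

Because the projection is a single fixed matrix applied after the encoder, it adds essentially no inference cost, which is what makes the reported boost 'free'.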