Breaks the long-standing accuracy-robustness trade-off in VLMs by localizing adversarial robustness to shallow layers.
arXiv · March 16, 2026 · 2603.12799
Why it matters
The paper finds that robustness is primarily a shallow-layer phenomenon driven by low-frequency spectral bias. By freezing the pre-trained weights and adapting only the initial layers (R-Adapt), models can be made robust without the typical 10-20% drop in clean accuracy; a rough sketch of this freeze-then-adapt setup follows below.
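As a minimal sketch of the idea (not the paper's actual R-Adapt implementation), the snippet below freezes an open_clip ViT-B/32 vision tower and re-enables gradients only for its first few transformer blocks. The split point `num_shallow_layers`, the learning rate, and the adversarial objective are all illustrative assumptions.

```python
import torch
import open_clip

# Load a CLIP-style VLM vision/text backbone (assumed model choice).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai"
)

# 1) Freeze every pre-trained weight.
for p in model.parameters():
    p.requires_grad = False

# 2) Unfreeze only the shallow transformer blocks of the vision encoder,
#    where the paper localizes adversarial robustness.
num_shallow_layers = 3  # hypothetical choice; tune per model
for block in model.visual.transformer.resblocks[:num_shallow_layers]:
    for p in block.parameters():
        p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-5)

# Adversarial fine-tuning (e.g. PGD-perturbed images) would then update
# only these shallow parameters, leaving the rest of the VLM untouched.
```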
From the abstract
Achieving adversarial robustness in Vision-Language Models (VLMs) inevitably compromises accuracy on clean data, presenting a long-standing and challenging trade-off. In this work, we revisit this trade-off by investigating a fundamental question: What makes VLMs robust? Through a detailed analysis of adversarially fine-tuned models, we examine how robustness mechanisms function internally and how they interact with clean accuracy. Our analysis reveals that adversarial robustness is not uniformly distributed across the network, but is concentrated in its shallow layers. […]