Finds that nominal instruction-tuning with LoRA often fails to improve (and can even degrade) verifiable instruction-following despite improvements on broader benchmarks.
March 25, 2026
Original Paper
Instruction-Tuned, but Not More Verifiable Instruction-Following: A Cross-Task Diagnosis for LoRA Adapters
arXiv · 2603.22379
The Takeaway
The paper identifies a 'capability drift': the adapter's nominal training objective does not match its realized gains. This is a critical warning for practitioners that common fine-tuning recipes for instruction following may fail to improve actual constraint adherence in production.
From the abstract
Adapters are often selected and deployed based on nominal labels (e.g., instruction-tuned), which implicitly suggest what capability improves after adaptation. We test whether nominal training objectives reliably align with realized cross-task capability gains by evaluating the same LoRA adapter across tasks. Our strongest evidence is tied to strict, automatically verifiable instruction following as measured by IFEval: across multiple seeds, base models, and LoRA settings, nominal labels […]
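One way to picture the diagnosis is as a comparison of per-benchmark score deltas for the same adapter: drift occurs when the nominally targeted capability fails to improve even as others do. The sketch below is illustrative only; the function name, score values, and zero-gain threshold are assumptions, not the paper's code or results.

```python
def diagnose_drift(before: dict, after: dict, nominal: str, tol: float = 0.0) -> dict:
    """Flag 'capability drift': the nominally targeted benchmark fails to
    improve (delta <= tol) while at least one other benchmark does improve."""
    deltas = {k: after[k] - before[k] for k in before}
    nominal_gain = deltas[nominal]
    other_gains = [v for k, v in deltas.items() if k != nominal]
    drift = nominal_gain <= tol and any(v > tol for v in other_gains)
    return {"deltas": deltas, "drift": drift}

# Hypothetical example: an adapter labeled "instruction-tuned" whose strict
# IFEval score stalls while a broader benchmark improves.
before = {"ifeval_strict": 0.42, "mmlu": 0.58}
after = {"ifeval_strict": 0.41, "mmlu": 0.63}
report = diagnose_drift(before, after, nominal="ifeval_strict")
```

Under these made-up numbers, `report["drift"]` is `True`: the label says instruction following, but the verifiable instruction-following metric is the one benchmark that did not gain.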