AI & ML Scaling Insight

Identifies that in-context reasoning over pretraining knowledge only emerges after specific types of fine-tuning, not from pretraining alone.

March 24, 2026

Original Paper

Understanding Contextual Recall in Transformers: How Finetuning Enables In-Context Reasoning over Pretraining Knowledge

Bhavya Vasudeva, Puneesh Deora, Alberto Bietti, Vatsal Sharan, Christos Thrampoulidis

arXiv · 2603.20969

The Takeaway

Provides a mechanistic explanation for why 'raw' pretrained models often fail at contextual recall despite 'knowing' the underlying facts. It identifies the formation of low-dimensional latent encodings as the trigger for in-context learning (ICL), helping researchers optimize post-training for better reasoning.

From the abstract

Transformer-based language models excel at in-context learning (ICL), where they can adapt to new tasks based on contextual examples, without parameter updates. In a specific form of ICL, which we refer to as "contextual recall," models pretrained on open-ended text leverage pairwise examples to recall specific facts in novel prompt formats. We investigate whether contextual recall emerges from pretraining alone, what finetuning is required, and what mechanisms drive the necessary represe
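As a rough illustration (not taken from the paper), a contextual-recall prompt of the kind the abstract describes might pair entities with attributes in context, then query an entity whose answer never appears in the context and must come from pretraining knowledge. The country/capital pairs below are a hypothetical example of such a format:

```python
# Hypothetical contextual-recall prompt: the in-context pairs establish a
# "country : capital" format; the correct completion for the final query
# ("Tokyo") is absent from the context and must be recalled from
# pretraining knowledge.
context_pairs = [
    ("France", "Paris"),
    ("Italy", "Rome"),
]
query_entity = "Japan"  # answer must come from pretrained facts

prompt = "".join(f"{country} : {capital}\n" for country, capital in context_pairs)
prompt += f"{query_entity} :"
print(prompt)
```

The pairwise examples only teach the prompt *format*; the fact itself is stored in the model's parameters, which is what distinguishes contextual recall from ordinary few-shot pattern matching.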