Mechanistic analysis reveals that LLMs fail at character counting not because they lack the information, but because 'negative circuits' in the final layers actively suppress the correct answer.
April 2, 2026
Original Paper
From Early Encoding to Late Suppression: Interpreting LLMs on Character Counting Tasks
arXiv · 2604.00778
The Takeaway
The finding challenges the idea that symbolic failures stem from a lack of data or scale. Practitioners can use this insight to improve symbolic reasoning by targeting specific late-layer MLP components rather than simply scaling up or instruction-tuning.
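To make "targeting late-layer MLP components" concrete, here is a minimal sketch of zero-ablating one late MLP's output with a PyTorch forward hook. The toy six-block residual stack is a stand-in for the paper's LLaMA/Qwen-scale models, and the layer index is illustrative; only the hook mechanics carry over.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Toy residual block: x + MLP(x), mimicking a transformer's MLP sublayer."""
    def __init__(self, d):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        return x + self.mlp(x)

torch.manual_seed(0)
model = nn.Sequential(*[Block(8) for _ in range(6)])

def ablate_mlp(module, inputs, output):
    # Returning a tensor from a forward hook replaces the module's output,
    # so this block contributes nothing to the residual stream.
    return torch.zeros_like(output)

layer_to_ablate = 5  # a late layer, where the paper locates the suppression
handle = model[layer_to_ablate].mlp.register_forward_hook(ablate_mlp)

x = torch.randn(1, 8)
with torch.no_grad():
    ablated = model(x)   # late MLP silenced: its block acts as the identity
handle.remove()
with torch.no_grad():
    baseline = model(x)  # unmodified forward pass

print(torch.allclose(ablated, baseline))
```

In an interpretability study the same hook would be attached to a real model's late MLP modules, and the change in the answer logits (e.g. for the correct count token) would be measured before and after ablation.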
From the abstract
Large language models (LLMs) exhibit failures on elementary symbolic tasks such as character counting in a word, despite excelling on complex benchmarks. Although this limitation has been noted, the internal reasons remain unclear. We use character counting (e.g., "How many p's are in apple?") as a minimal, controlled probe that isolates token-level reasoning from higher-level confounds. Using this setting, we uncover a consistent phenomenon across modern architectures, including LLaMA, Qwen, an […]
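One appeal of this probe is that ground truth is trivial to generate programmatically, so any model error is unambiguous. A minimal sketch of such a probe generator (the prompt template mirrors the abstract's example; the exact wording in the paper may differ):

```python
def make_probe(word: str, letter: str) -> tuple[str, int]:
    """Build a character-counting prompt and its ground-truth answer."""
    prompt = f"How many {letter}'s are in {word}?"
    answer = word.count(letter)  # exact count, no ambiguity
    return prompt, answer

print(make_probe("apple", "p"))  # ("How many p's are in apple?", 2)
```

Because the answer is computed directly from the string, the task isolates token-level reasoning: a wrong model answer cannot be blamed on label noise or task ambiguity.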