Transformers, linear RNNs, and LSTMs all independently evolve the same periodic mathematical patterns to represent numbers.
April 23, 2026
Original Paper
Convergent Evolution: How Different Language Models Learn Similar Number Representations
arXiv · 2604.20817
The Takeaway
Different AI architectures converge on the same periodic way of representing numbers when trained on human text. The representation appears across diverse families of models, much as eyes evolved independently many times in nature. This suggests there is a fundamental mathematical structure in language that any sufficiently capable learner will eventually discover. Representations were previously assumed to be arbitrary and model-specific; this finding implies that the way AI sees math might instead be a universal property of intelligence.
From the abstract
Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: while Transformers, Linear RNNs, LSTMs, and classical word embeddings trained in different ways all learn features that have period-$T$ spikes in the Fourier domain, only some learn geometrically separable features that can be used to linearly classify a number mod-$T$. To explain this incongruity, […]
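The abstract names two concrete measurements: a Fourier-domain test for period-$T$ spikes, and a linear probe for whether a number's residue mod $T$ is geometrically decodable. Below is a minimal Python sketch of both analyses on synthetic embeddings for the integers 0–999; the feature construction, train/test split, and use of scikit-learn's `LogisticRegression` are illustrative assumptions standing in for a real model's embedding matrix, not the paper's code.

```python
# A minimal sketch (not the paper's method) of the two tests from the abstract,
# run on synthetic embeddings in place of a trained model's.
import numpy as np
from numpy.fft import rfft, rfftfreq
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic embeddings for the integers 0..999: periodic features with the
# reported periods T = 2, 5, 10, plus one non-periodic nuisance dimension.
# A real analysis would slice these rows from a model's embedding matrix.
n = np.arange(1000)
E = np.stack([
    np.cos(2 * np.pi * n / 2),    # period-2 feature (parity)
    np.cos(2 * np.pi * n / 5),    # period-5 feature
    np.sin(2 * np.pi * n / 10),   # period-10 feature
    rng.normal(size=n.size),      # noise: no meaningful dominant period
], axis=1)

# Test 1 (Fourier domain): transform each embedding dimension along the
# integer axis and report the period of its strongest non-DC component.
spectrum = np.abs(rfft(E, axis=0))
freqs = rfftfreq(n.size)          # cycles per unit step in n
for d in range(E.shape[1]):
    k = spectrum[1:, d].argmax() + 1   # skip the DC bin
    print(f"dim {d}: dominant period ≈ {1 / freqs[k]:.1f}")

# Test 2 (linear probe): can n mod T be read off with a linear classifier?
# Geometric separability is a strictly stronger property than having a
# period-T Fourier spike.
T = 10
probe = LogisticRegression(max_iter=1000).fit(E[:800], n[:800] % T)
print(f"mod-{T} probe accuracy: {probe.score(E[800:], n[800:] % T):.2f}")
```

In the paper's two-tiered hierarchy, a feature family can pass the first test (a clean Fourier spike) while failing the second (no linearly decodable residue); the synthetic features above pass both by construction, so on real embeddings it is the probe accuracy that separates the tiers.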