A tiny transformer mastered SAT solving and binary multiplication by learning to execute a universal programming language.
April 29, 2026
Original Paper
Training Transformers as a Universal Computer
arXiv · 2604.25166
The Takeaway
Most practitioners treat transformers as statistical machines that predict the next token in a sequence. This experiment trained a small model on MicroPy, a simplified but computationally universal language, to test whether it could act as a general-purpose computer. The model generalized to execute novel programs it never saw during training. This offers practical evidence that the transformer architecture is Turing-complete in a usable sense and can be trained to carry out arbitrary algorithmic logic. We are moving toward a future where the distinction between a neural network and a traditional computer blurs.
From the abstract
We demonstrate that a small transformer can learn to execute programs in MicroPy, a simplified yet computationally universal programming language. Given procedure definitions together with an expression to evaluate, the transformer predicts small-step execution using PENCIL scaffolding for space-efficient execution within a bounded context window. After training on randomly generated, meaningless MicroPy programs, the learned transformer generalizes to various human-written programs including binary multiplication and SAT solving.
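To make "small-step execution" concrete: the model is trained to predict a trace in which each line rewrites the previous one by a single reduction. Here is a minimal sketch of such a trace generator for a toy arithmetic subset; the names and grammar are illustrative assumptions, not MicroPy's actual syntax or the paper's implementation.

```python
# Hypothetical illustration of small-step execution traces of the kind a
# transformer could be trained to predict. Expressions are either ints
# (values) or ("op", left, right) tuples; each call to step() performs
# exactly one leftmost reduction.

def step(expr):
    """Perform one leftmost reduction; return expr unchanged if it is a value."""
    if isinstance(expr, int):
        return expr  # already fully reduced
    op, left, right = expr
    if not isinstance(left, int):
        return (op, step(left), right)   # reduce inside the left operand
    if not isinstance(right, int):
        return (op, left, step(right))   # reduce inside the right operand
    return left + right if op == "+" else left * right  # apply the operator

def trace(expr):
    """Full small-step trace: each element is one rewrite of the previous."""
    states = [expr]
    while not isinstance(states[-1], int):
        states.append(step(states[-1]))
    return states

if __name__ == "__main__":
    # (1 + 2) * 3  →  3 * 3  →  9
    for state in trace(("*", ("+", 1, 2), 3)):
        print(state)
```

Each intermediate state becomes a next-step prediction target; the PENCIL scaffolding the abstract mentions further lets finished intermediate work be erased so long traces fit in a bounded context window.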