Applies reinforcement learning with a cycle-consistency reward to drastically improve natural language to Lean4 autoformalization.
March 26, 2026
Original Paper
Improving Lean4 Autoformalization via Cycle Consistency Fine-tuning
arXiv · 2603.24372
The Takeaway
By using a round-trip loop (NL to Lean4 back to NL) to measure meaning preservation, the authors bypass the need for massive human-labeled formal proofs. This technique significantly outperforms standard supervised fine-tuning on benchmarks such as PutnamBench, offering a path toward automated verification of the mathematical literature.
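The round-trip idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: `formalize` and `informalize` are hypothetical stand-ins for the fine-tuned NL-to-Lean4 model and a back-translation model, and the similarity score here is a crude string ratio where the paper would use a stronger semantic comparison.

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for the two models in the cycle.
# A real setup would call the fine-tuned LLM (NL -> Lean4) and a
# back-translation model (Lean4 -> NL); these toy lookups are for
# illustration only.
def formalize(nl: str) -> str:
    table = {
        "Every natural number is nonnegative.":
            "theorem t : \u2200 n : \u2115, 0 \u2264 n := by omega",
    }
    return table.get(nl, "sorry")

def informalize(lean: str) -> str:
    table = {
        "theorem t : \u2200 n : \u2115, 0 \u2264 n := by omega":
            "Every natural number is nonnegative.",
    }
    return table.get(lean, "")

def cycle_consistency_reward(nl: str) -> float:
    """Round-trip NL -> Lean4 -> NL and score meaning preservation.

    SequenceMatcher gives a [0, 1] string-overlap ratio; a faithful
    round-trip scores near 1.0, a garbled one near 0.0.
    """
    round_trip = informalize(formalize(nl))
    return SequenceMatcher(None, nl, round_trip).ratio()

reward = cycle_consistency_reward("Every natural number is nonnegative.")
```

In an RL fine-tuning loop, a reward like this would be computed per sampled Lean4 output and used to update the formalizer, so no parallel NL-Lean labels are required.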
From the abstract
Autoformalization - automatically translating natural language mathematical texts into a formal proof language such as Lean4 - can help accelerate AI-assisted mathematical research, be it via proof verification or proof search. I fine-tune Qwen3.5-2B with LoRA for natural language to Lean4 formalization on FineLeanCorpus and consider three training regimes: supervised fine-tuning (SFT) with curriculum learning (difficulty 1 to 10), SFT without curriculum ordering, and reinforcement learning using a cycle-consistency reward.