Expands formal reasoning beyond proof construction to the generation and formal verification of counterexamples in Lean 4.
March 23, 2026
Original Paper
Learning to Disprove: Formal Counterexample Generation with Large Language Models
arXiv · 2603.19514
The Takeaway
Mathematical reasoning in AI has focused almost exclusively on proving true statements; this paper systematically addresses the 'other half' of the problem: disproving false ones. It uses symbolic mutation to create training data for disproof, a critical step for automated formal verification and synthesis.
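To make the task concrete, here is a minimal illustration (not from the paper) of what a formally verified counterexample looks like in Lean 4: a false universal claim is disproved by exhibiting a specific witness that the kernel checks.

```lean
-- A false universal statement: "every natural number is even."
-- Disproving it amounts to producing a concrete counterexample (n = 1)
-- and a machine-checked proof that the claim fails for that witness.
theorem not_all_even : ¬ ∀ n : Nat, n % 2 = 0 :=
  fun h => absurd (h 1) (by decide)  -- 1 % 2 = 1, so the hypothesis is refuted
```

The point is that a counterexample is not merely asserted: once stated as a negation, Lean verifies the disproof with the same rigor as any proof.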
From the abstract
Mathematical reasoning demands two critical, complementary skills: constructing rigorous proofs for true statements and discovering counterexamples that disprove false ones. However, current AI efforts in mathematics focus almost exclusively on proof construction, often neglecting the equally important task of finding counterexamples. In this paper, we address this gap by fine-tuning large language models (LLMs) to reason about and generate counterexamples. We formalize this task as formal counterexample generation…