AI & ML · New Capability

Prism prevents 'diversity collapse' in self-evolving reasoning systems by using semantic partitioning to guide the generation of new problems.

arXiv · March 17, 2026 · 2603.13309

Vaibhav Mishra

The Takeaway

Self-improving reasoning models such as o1 can suffer from curriculum stagnation, where the problems they pose for themselves stop diversifying. Prism steers problem generation toward a balanced exploration of the problem space, which leads to reported accuracy gains (+4 points on math benchmarks) and yields high-quality synthetic datasets.
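To make "balanced exploration" concrete, here is a minimal sketch of coverage-aware sampling over semantic partitions. It is an illustration under stated assumptions, not Prism's actual method: the fixed centroids, the toy 2-D embeddings, the inverse-count weighting, and every function name below are hypothetical.

```python
import random

def nearest_partition(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centroids[i])),
    )

def coverage_counts(embeddings, centroids):
    """Count how many existing problems fall into each partition."""
    counts = {i: 0 for i in range(len(centroids))}
    for vec in embeddings:
        counts[nearest_partition(vec, centroids)] += 1
    return counts

def sampling_weights(counts):
    """Weight each partition inversely to its coverage (assumed heuristic)."""
    raw = {i: 1.0 / (1 + c) for i, c in counts.items()}
    total = sum(raw.values())
    return {i: w / total for i, w in raw.items()}

def pick_target_partition(counts, rng):
    """Sample the partition the next generated problem should target."""
    weights = sampling_weights(counts)
    ids = sorted(weights)
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]

# Toy data: three partitions, with most existing problems near the first one.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
embeddings = [(0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.9, 0.1)]

counts = coverage_counts(embeddings, centroids)
weights = sampling_weights(counts)
target = pick_target_partition(counts, random.Random(0))
```

In this toy setup the empty third partition receives the largest weight, so new problems are most likely to be requested there. A real system would presumably derive partitions from learned embeddings rather than hand-picked centroids.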

From the abstract

Self-evolving reasoning frameworks let LLMs improve their reasoning capabilities by iteratively generating and solving problems without external supervision, using verifiable rewards. Ideally, such systems are expected to explore a diverse problem space and propose new challenges of high learning value. While prior work has largely focused on solver-side optimisation and verification, recent evidence suggests that self-evolving systems can exhibit diversity collapse in posing new problems after