AI & ML Paradigm Shift

COvolve creates an automated curriculum for open-ended learning by co-evolving environments and policies as executable code through a zero-sum game.

March 31, 2026

Original Paper

COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game

Alkis Sygkounas, Rishi Hazra, Andreas Persson, Pedro Zuidberg Dos Martires, Amy Loutfi

arXiv · 2603.28386

The Takeaway

By using LLMs to generate both the task and the solution as executable code, then solving for a Nash equilibrium to prevent forgetting, this framework moves beyond static training sets. It enables agents to continually improve and generalize without human-designed task distributions.
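The co-evolutionary loop described above can be sketched as alternating best responses between an environment designer and a policy designer. This is a minimal toy sketch, not the paper's method: the `propose_*` functions stand in for LLM code generation, the environment is reduced to a single hypothetical threshold parameter, and the zero-sum payoff is a simple pass/fail score.

```python
import random

def propose_environment(rng):
    # Hypothetical stand-in for an LLM-generated environment:
    # a single difficulty threshold the policy must clear.
    return {"threshold": rng.uniform(0, 1)}

def propose_policy(rng):
    # Hypothetical stand-in for an LLM-generated policy:
    # a single action value.
    return {"action": rng.uniform(0, 1)}

def payoff(policy, env):
    # Zero-sum score: the policy earns +1 if it clears the
    # environment's threshold; the environment designer earns
    # the negation of whatever the policy earns.
    return 1.0 if policy["action"] >= env["threshold"] else -1.0

def coevolve(steps=50, seed=0):
    rng = random.Random(seed)
    env, pol = propose_environment(rng), propose_policy(rng)
    history = []
    for _ in range(steps):
        # Policy player: keep the candidate if it scores better
        # against the current environment.
        cand_pol = propose_policy(rng)
        if payoff(cand_pol, env) > payoff(pol, env):
            pol = cand_pol
        # Environment player: keep the candidate if it lowers
        # the current policy's payoff (zero-sum objective).
        cand_env = propose_environment(rng)
        if payoff(pol, cand_env) < payoff(pol, env):
            env = cand_env
        history.append(payoff(pol, env))
    return history
```

Each iteration tightens the environment against the current policy and strengthens the policy against the current environment, which is the adversarial pressure that drives the automated curriculum.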

From the abstract

A central challenge in building continually improving agents is that training environments are typically static or manually constructed. This restricts continual learning and generalization beyond the training distribution. We address this with COvolve, a co-evolutionary framework that leverages large language models (LLMs) to generate both environments and agent policies, expressed as executable Python code. We model the interaction between environment and policy designers as a two-player zero-sum game.
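The two-player zero-sum formulation can be illustrated on a toy matrix game. The sketch below is not from the paper: it uses fictitious play (a classical alternating-best-response scheme) on matching pennies, a stand-in game, to show how repeated best responses drive both players' empirical strategies toward the mixed Nash equilibrium (0.5, 0.5).

```python
# Row player's payoff matrix; the column player receives the negation.
A = [[1, -1],
     [-1, 1]]

def best_response_row(col_counts):
    # Row player best-responds to the column player's empirical mixture.
    vals = [sum(A[i][j] * col_counts[j] for j in range(2)) for i in range(2)]
    return max(range(2), key=vals.__getitem__)

def best_response_col(row_counts):
    # Column player minimizes the row player's expected payoff.
    vals = [sum(A[i][j] * row_counts[i] for i in range(2)) for j in range(2)]
    return min(range(2), key=vals.__getitem__)

def fictitious_play(rounds=10_000):
    row_counts, col_counts = [1, 0], [0, 1]  # arbitrary initial play
    for _ in range(rounds):
        r = best_response_row(col_counts)
        c = best_response_col(row_counts)
        row_counts[r] += 1
        col_counts[c] += 1
    total = rounds + 1
    # Empirical strategy frequencies for each player.
    return ([n / total for n in row_counts],
            [n / total for n in col_counts])
```

In a zero-sum game, converging to the Nash equilibrium means neither the environment designer nor the policy designer can unilaterally gain, which is the stability property the framework uses to keep the curriculum from collapsing or regressing.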