COvolve creates an automated curriculum for open-ended learning by co-evolving environments and policies as executable code through a zero-sum game.
March 31, 2026
Original Paper
COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game
arXiv · 2603.28386
The Takeaway
By using LLMs to generate both the task and the solution, then solving for the Nash equilibrium of the resulting game to prevent forgetting, this framework moves beyond static training sets. It enables agents to continually improve and generalize without human-designed task distributions.
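To make the game-theoretic core concrete: in a two-player zero-sum game, the Nash equilibrium is a pair of (possibly mixed) strategies from which neither the environment designer nor the policy designer can profitably deviate. The sketch below is not the paper's algorithm; it is a minimal, self-contained illustration of solving a zero-sum matrix game by fictitious play, on a hypothetical 2x2 payoff matrix (matching pennies), where both players' empirical strategy frequencies converge to the equilibrium mix.

```python
# Illustrative sketch only: fictitious play on a hypothetical 2x2
# zero-sum payoff matrix (rows: environment designer, maximizer;
# cols: policy designer, minimizer). Not COvolve's actual method.

def fictitious_play(payoff, rounds=10000):
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows  # how often each row strategy was played
    col_counts = [0] * n_cols  # how often each column strategy was played
    row_play, col_play = 0, 0  # arbitrary initial strategies
    for _ in range(rounds):
        row_counts[row_play] += 1
        col_counts[col_play] += 1
        # Row player best-responds to the column player's empirical mix.
        row_play = max(
            range(n_rows),
            key=lambda i: sum(payoff[i][j] * col_counts[j] for j in range(n_cols)),
        )
        # Column player (minimizer) best-responds to the row player's mix.
        col_play = min(
            range(n_cols),
            key=lambda j: sum(payoff[i][j] * row_counts[i] for i in range(n_rows)),
        )
    return ([c / rounds for c in row_counts], [c / rounds for c in col_counts])

# Matching pennies: the unique Nash equilibrium mixes 50/50 for both players.
payoff = [[1, -1], [-1, 1]]
row_mix, col_mix = fictitious_play(payoff)
```

In zero-sum games, fictitious play's empirical frequencies are known to converge to a Nash equilibrium; here both `row_mix` and `col_mix` approach (0.5, 0.5). The equilibrium condition is what the takeaway ties to preventing forgetting: a policy at equilibrium is robust against the whole environment distribution, not just the most recently generated tasks.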
From the abstract
A central challenge in building continually improving agents is that training environments are typically static or manually constructed. This restricts continual learning and generalization beyond the training distribution. We address this with COvolve, a co-evolutionary framework that leverages large language models (LLMs) to generate both environments and agent policies, expressed as executable Python code. We model the interaction between environment and policy designers as a two-player zero-