A systematic critique explaining why 'self-improving' generative optimization loops fail in production and how to fix them.
March 26, 2026
Original Paper
Understanding the Challenges in Iterative Generative Optimization with LLMs
arXiv · 2603.23994
The Takeaway
While many papers claim agents can self-improve via feedback, only 9% of surveyed real-world agents use any automated optimization. This paper identifies 'hidden' design choices (what the optimizer may edit, the credit horizon, batching) that determine success, providing a blueprint for making self-optimizing LLM systems actually work.
From the abstract
Generative optimization uses large language models (LLMs) to iteratively improve artifacts (such as code, workflows, or prompts) using execution feedback. It is a promising approach to building self-improving agents, yet in practice remains brittle: despite active research, only 9% of surveyed agents used any automated optimization. We argue that this brittleness arises because, to set up a learning loop, an engineer must make 'hidden' design choices: What can the optimizer edit and what is the
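To make the loop the abstract describes concrete, here is a minimal sketch of an iterative generative optimization loop: propose a revised artifact, score it via execution feedback, and keep it only if it improves. Everything here is illustrative, not the paper's method: `propose` stands in for an LLM call, and the toy artifact (a list of coefficients scored against a hidden target) stands in for code, a workflow, or a prompt.

```python
def optimize(artifact, propose, score, steps=20):
    """Greedy generative optimization loop (illustrative sketch).

    propose(artifact, feedback) -> candidate   # in practice, an LLM edit
    score(artifact) -> (value, feedback)       # in practice, execution feedback
    A candidate is kept only if it strictly improves the score
    (i.e. a one-step credit horizon, one 'hidden' choice the paper flags).
    """
    best_val, feedback = score(artifact)
    for _ in range(steps):
        candidate = propose(artifact, feedback)
        val, fb = score(candidate)
        if val > best_val:  # accept only strict improvements
            artifact, best_val, feedback = candidate, val, fb
    return artifact, best_val


# Toy stand-ins: the "artifact" is a list of coefficients; "execution
# feedback" is the squared error against a hidden target (hypothetical).
TARGET = [3, 1, 4]

def score(coeffs):
    err = sum((c - t) ** 2 for c, t in zip(coeffs, TARGET))
    return -err, f"squared error {err}"

def make_proposer():
    # A real system would prompt an LLM with the artifact and feedback;
    # here we deterministically increment one coefficient in round-robin.
    state = {"i": 0}
    def propose(coeffs, feedback):
        out = list(coeffs)
        out[state["i"] % len(out)] += 1
        state["i"] += 1
        return out
    return propose

best, val = optimize([0, 0, 0], make_proposer(), score, steps=20)
# → best == [3, 1, 4], val == 0
```

Even this toy makes the paper's point visible: the loop's behavior depends entirely on choices the abstract calls 'hidden': what `propose` is allowed to edit, how far ahead credit is assigned, and how candidates are batched and accepted.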