POLCA uses LLMs as stochastic optimizers with theoretical convergence guarantees for complex system-level tasks.
March 17, 2026
Original Paper
POLCA: Stochastic Generative Optimization with LLM
arXiv · 2603.14769
The Takeaway
This framework automates the labor-intensive iterative refinement of LLM prompts and agent behaviors in noisy, stochastic environments. It treats the LLM itself as an optimizer that learns across historical trials, and reports significant gains over manual human-in-the-loop tuning.
From the abstract
Optimizing complex systems, ranging from LLM prompts to multi-turn agents, traditionally requires labor-intensive manual iteration. We formalize this challenge as a stochastic generative optimization problem where a generative language model acts as the optimizer, guided by numerical rewards and text feedback to discover the best system. We introduce Prioritized Optimization with Local Contextual Aggregation (POLCA), a scalable framework designed to handle stochasticity in optimization -- such a
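The abstract's core idea, an LLM proposing new candidate systems from a prioritized history of past trials and their noisy rewards, can be illustrated with a toy sketch. Everything here is an assumption for illustration: the single numeric "system" parameter, the mock `llm_propose` stand-in (a real implementation would serialize the trial history into a prompt and parse the model's reply), and the simple top-k history selection loosely standing in for prioritization and contextual aggregation. This is not the paper's algorithm, only the general loop shape.

```python
import random

random.seed(0)

# Toy "system" to optimize: one numeric knob standing in for a prompt or
# agent configuration. The true optimum is x = 3.0, but rewards are observed
# with Gaussian noise -- the stochastic setting the abstract describes.
def noisy_reward(x: float) -> float:
    return -(x - 3.0) ** 2 + random.gauss(0.0, 0.5)

# Stand-in for the LLM optimizer (hypothetical): given a small context of
# top historical trials, propose a new candidate by exploring near the best.
def llm_propose(context: list[tuple[float, float]]) -> float:
    best_x = max(context, key=lambda t: t[1])[0] if context else 0.0
    return best_x + random.gauss(0.0, 1.0)

def optimize(num_trials: int = 200, context_size: int = 5) -> float:
    history: list[tuple[float, float]] = []  # (candidate, observed reward)
    for _ in range(num_trials):
        # Crude stand-in for prioritization: show only the top-k trials.
        context = sorted(history, key=lambda t: t[1], reverse=True)[:context_size]
        x = llm_propose(context)
        history.append((x, noisy_reward(x)))
    return max(history, key=lambda t: t[1])[0]

best = optimize()
print(round(best, 2))
```

Even this crude loop converges near the optimum because averaging over many trials washes out reward noise; the paper's contribution lies in doing this at scale with structured text feedback rather than a single scalar.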