A bilevel framework where an outer LLM loop meta-optimizes an inner autoresearch loop by autonomously generating and injecting Python code at runtime.
March 25, 2026
Original Paper
Bilevel Autoresearch: Meta-Autoresearching Itself
arXiv · 2603.23420
The Takeaway
This is a step toward recursive self-improvement in AI research. The system autonomously discovered non-trivial mechanisms such as multi-armed bandits and design-of-experiments (DOE) methods to improve its own search efficiency, achieving a 5x gain on pretraining benchmarks without human intervention.
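To make the bandit idea concrete: one of the mechanisms the system reportedly discovered is a multi-armed bandit for deciding which candidate research strategy gets the next experiment slot. The sketch below is purely illustrative, not the paper's actual code; the strategy names, reward model, and UCB1 choice are assumptions for the example.

```python
import math
import random

def ucb1_select(counts, rewards, t):
    """Pick the arm maximizing the UCB1 score: mean reward + exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try each strategy once before trusting the bonus term
    return max(
        range(len(counts)),
        key=lambda a: rewards[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )

def run_bandit(arm_means, horizon, seed=0):
    """Allocate `horizon` experiment slots across candidate strategies.

    `arm_means` are the (unknown to the bandit) expected benchmark scores
    of each strategy; observations are perturbed with Gaussian noise.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts, rewards = [0] * k, [0.0] * k
    for t in range(1, horizon + 1):
        arm = ucb1_select(counts, rewards, t)
        counts[arm] += 1
        rewards[arm] += rng.gauss(arm_means[arm], 0.1)  # noisy benchmark score
    return counts

# Hypothetical strategies with true mean scores 0.2, 0.5, 0.8:
counts = run_bandit([0.2, 0.5, 0.8], horizon=300)
```

After a few hundred rounds the allocator concentrates its budget on the strongest strategy while still occasionally probing the weaker ones, which is the efficiency lever such a mechanism would offer an autoresearch loop.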
From the abstract
If autoresearch is itself a form of research, then autoresearch can be applied to research itself. We take this idea literally: we use an autoresearch loop to optimize the autoresearch loop. Every existing autoresearch system -- from Karpathy's single-track loop to AutoResearchClaw's multi-batch extension and EvoScientist's persistent memory -- was improved by a human who read the code, identified a bottleneck, and wrote new code. We ask whether an LLM can do the same, autonomously. We present B