AI & ML Practical Magic

You can now slash the cost of repetitive web automation from $150 down to 10 cents by 'compiling' LLM reasoning into JSON.

April 15, 2026

Original Paper

Agentic Compilation: Mitigating the LLM Rerun Crisis for Minimized-Inference-Cost Web Automation

arXiv · 2604.09718

The Takeaway

Agentic Compilation introduces a 'Compile-and-Execute' architecture that turns expensive, real-time LLM reasoning into a deterministic blueprint. For repetitive tasks, this amortizes the inference cost from O(M x N) to O(1), making high-frequency web scraping and automation economically viable. Until now, agents were too expensive for most production use cases because they 're-thought' the same problem on every run. With this approach, once the agent has worked out a path, replaying it costs almost nothing in inference. This allows developers to build industrial-grade agents that are both reliable and affordable.
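
To make the Compile-and-Execute idea concrete, here is a minimal Python sketch, not the paper's implementation. Everything in it is an assumption for illustration: the StubPlanner class stands in for an LLM and only counts how often it is queried, the five steps, selectors, and URL are placeholders, and the blueprint schema (a "task" plus a list of "steps") is invented. It compares how many model calls a continuous agent and a compiled blueprint need for a 5-step workflow run 500 times.

```python
import json

# Hypothetical 5-step workflow; the URL and selectors are placeholders.
STEPS = [
    {"action": "goto", "target": "https://example.com/login"},
    {"action": "fill", "target": "#user", "value": "demo"},
    {"action": "fill", "target": "#password", "value": "secret"},
    {"action": "click", "target": "#submit"},
    {"action": "extract", "target": "#report"},
]


class StubPlanner:
    """Stand-in for an LLM (assumption): it only counts how often it is queried."""

    def __init__(self):
        self.calls = 0

    def next_action(self, step_index):
        # Continuous-agent mode: one model call per step, on every run.
        self.calls += 1
        return STEPS[step_index]

    def compile(self, task):
        # Compile mode: one model call emits the whole workflow as JSON.
        self.calls += 1
        return json.dumps({"task": task, "steps": STEPS})


def run_continuous(planner, runs):
    """Ask the planner for every step of every run."""
    log = []
    for _ in range(runs):
        for i in range(len(STEPS)):
            log.append(planner.next_action(i)["action"])
    return log


def run_compiled(blueprint_json, runs):
    """Replay a stored blueprint; no planner is involved at execution time."""
    blueprint = json.loads(blueprint_json)
    log = []
    for _ in range(runs):
        for step in blueprint["steps"]:
            # A real executor would drive a browser here; this sketch only logs.
            log.append(step["action"])
    return log


continuous = StubPlanner()
run_continuous(continuous, runs=500)

compiled = StubPlanner()
blueprint = compiled.compile("log in and fetch report")
run_compiled(blueprint, runs=500)

print(continuous.calls, compiled.calls)  # prints "2500 1"
```

If each step is one model call, the 150 USD figure quoted for a 5-step, 500-iteration workflow works out to roughly 0.06 USD per call (150 / 2500). The sketch leaves out what happens when a page changes and the stored blueprint no longer matches, which a real system would have to handle.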

From the abstract

LLM-driven web agents operating through continuous inference loops -- repeatedly querying a model to evaluate browser state and select actions -- exhibit a fundamental scalability constraint for repetitive tasks. We characterize this as the Rerun Crisis: the linear growth of token expenditure and API latency relative to execution frequency. For a 5-step workflow over 500 iterations, a continuous agent incurs approximately 150.00 USD in inference costs; even with aggressive caching, this remains