AI & ML Scaling Insight

Cyber-attack capabilities of AI models scale log-linearly with inference-time compute, with no plateau in sight.

arXiv · March 13, 2026 · 2603.11214

Linus Folkerts, Will Payne, Simon Inman, Philippos Giavridis, Joe Skinner, Sam Deverett, James Aung, Ekin Zorer, Michael Schmatz, Mahmoud Ghanem, John Wilkinson, Alan Steer, Vy Hong, Jessica Wang

Why it matters

The paper reports that raising the inference-time compute budget (up to 100M tokens) can improve success rates by 59% on complex, multi-step cyber attacks. If that holds, safety evaluations run at a fixed token budget may significantly underestimate the risk posed by near-future models.
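To make "log-linear scaling" concrete: the claim is that performance grows by a roughly constant amount each time the token budget is multiplied by ten, i.e. score ≈ a + b · log10(tokens). The sketch below fits that form by ordinary least squares. The function names and the data points are hypothetical placeholders chosen for illustration; they are not results from the paper, and the paper's own fitting procedure may differ.

```python
import math


def fit_log_linear(token_budgets, scores):
    """Least-squares fit of score = a + b * log10(tokens).

    Returns (a, b). Hypothetical helper, not taken from the paper.
    """
    xs = [math.log10(t) for t in token_budgets]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx            # gain in score per 10x increase in tokens
    a = mean_y - b * mean_x  # intercept at log10(tokens) = 0
    return a, b


def predict(a, b, tokens):
    """Extrapolate the fitted line to another token budget."""
    return a + b * math.log10(tokens)


# Synthetic placeholder data (NOT from the paper): average attack steps
# completed at each token budget.
budgets = [1e5, 1e6, 1e7, 1e8]
steps_completed = [2.0, 4.0, 6.0, 8.0]

a, b = fit_log_linear(budgets, steps_completed)
print(f"slope per 10x tokens: {b:.2f}")
print(f"extrapolated score at 1e9 tokens: {predict(a, b, 1e9):.2f}")
```

Under this form, an evaluation capped at a small budget reads off a low point on the line, which is why a fixed-budget result can understate what the same model reaches with more tokens. The extrapolation is only as trustworthy as the assumption that no plateau appears beyond the measured range.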

From the abstract

We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across extended action sequences. By comparing seven models released over an eighteen-month period (August 2024 to February 2026) at varying inference-time compute budgets, we observe two capability trends. First, model performance scales log-linearly with infe

