AI & ML Efficiency Breakthrough

NASimJax reimplements an autonomous penetration testing simulator in JAX, increasing environment throughput by 100x.

March 23, 2026

Original Paper

NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing

Raphael Simon, José Carrasquel, Wim Mees, Pieter Libin

arXiv · 2603.19864

The Takeaway

Reinforcement learning for cybersecurity is bottlenecked by slow, CPU-bound simulators. Moving both the environment and the training pipeline onto hardware accelerators makes it feasible to train agents that generalize across realistic, large-scale network topologies, a setting previously considered computationally infeasible.
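To make the idea of an accelerator-resident environment concrete, here is a minimal sketch in JAX. It is not NASimJax's API: the toy environment, the constants `NUM_HOSTS` and `EXPLOIT_PROB`, and the functions `step`, `rollout`, and `batched_rollouts` are all hypothetical, chosen only to illustrate the general pattern. The environment step is a pure function, an episode is unrolled with `jax.lax.scan`, and many episodes run in parallel through `jax.vmap` under a single `jax.jit` compilation.

```python
from functools import partial

import jax
import jax.numpy as jnp

NUM_HOSTS = 8        # hypothetical toy network size
EXPLOIT_PROB = 0.5   # hypothetical per-attempt exploit success probability


def step(compromised, action, key):
    """One transition: attempt to exploit host `action`.

    Returns the updated boolean host mask and a reward of 1.0 only when
    a previously uncompromised host is newly compromised.
    """
    success = jax.random.bernoulli(key, EXPLOIT_PROB)
    newly = success & ~compromised[action]
    compromised = compromised.at[action].set(compromised[action] | success)
    return compromised, newly.astype(jnp.float32)


def rollout(key, num_steps):
    """Fixed-length episode under a uniform random policy (assumption)."""
    def body(compromised, step_key):
        action_key, exploit_key = jax.random.split(step_key)
        action = jax.random.randint(action_key, (), 0, NUM_HOSTS)
        return step(compromised, action, exploit_key)

    init = jnp.zeros(NUM_HOSTS, dtype=bool)
    step_keys = jax.random.split(key, num_steps)
    final, rewards = jax.lax.scan(body, init, step_keys)
    return final, rewards.sum()


@partial(jax.jit, static_argnames=("num_envs", "num_steps"))
def batched_rollouts(key, num_envs, num_steps):
    """Run `num_envs` independent episodes in parallel on the accelerator."""
    env_keys = jax.random.split(key, num_envs)
    return jax.vmap(lambda k: rollout(k, num_steps))(env_keys)


final_states, returns = batched_rollouts(jax.random.PRNGKey(0), 16, 32)
```

Because the whole rollout is compiled as one program, no per-step data crosses between Python and the device, which is the general reason this style of simulator can scale to large batches of environments. A real penetration testing environment would add partial observability and a much larger action space, which this sketch deliberately omits.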

From the abstract

Penetration testing, the practice of simulating cyberattacks to identify vulnerabilities, is a complex sequential decision-making task that is inherently partially observable and features large action spaces. Training reinforcement learning (RL) policies for this domain faces a fundamental bottleneck: existing simulators are too slow to train on realistic network scenarios at scale, resulting in policies that fail to generalize. We present NASimJax, a complete JAX-based reimplementation of the N