Introduces the Neural Zeroth-order Kernel (NZK), a theoretical foundation for how models evolve when trained without backpropagation.
March 24, 2026
Original Paper
Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective
arXiv · 2603.21169
The Takeaway
Zeroth-order optimization enables memory-efficient training on edge devices, since it needs no stored activations for backpropagation. This paper supplies a mathematical framework for understanding how models evolve under it, aiming to do for ZO training what NTK theory did for first-order deep learning research.
From the abstract
Zeroth-order (ZO) optimization enables memory-efficient training of neural networks by estimating gradients via forward passes only, eliminating the need for backpropagation. However, the stochastic nature of gradient estimation significantly obscures the training dynamics, in contrast to the well-characterized behavior of first-order methods under Neural Tangent Kernel (NTK) theory. To address this, we introduce the Neural Zeroth-order Kernel (NZK) to describe model evolution in function space.
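For context, the "well-characterized behavior" of first-order methods is the standard NTK result: under gradient flow, a network's outputs evolve in function space according to a kernel built from parameter gradients. A minimal statement of that baseline follows (this is standard first-order NTK theory, not the paper's NZK, whose definition is not quoted here):

```latex
% Empirical NTK: inner product of parameter gradients at two inputs
\Theta(x, x') = \nabla_\theta f(x;\theta)^{\top} \nabla_\theta f(x';\theta)

% Under gradient flow on a loss \mathcal{L}, outputs evolve through the kernel
\frac{d}{dt} f(x;\theta_t) = -\eta \sum_{i=1}^{n} \Theta(x, x_i)\, \nabla_{f(x_i)} \mathcal{L}
```

The "forward passes only" mechanism the abstract describes is, in most ZO work, a two-point randomized finite-difference estimator (SPSA/MeZO-style). Below is a minimal NumPy sketch under that assumption; the perturbation scale `mu`, the sample count `n_samples`, and the toy quadratic loss are illustrative choices, not taken from the paper:

```python
import numpy as np

def zo_gradient(loss_fn, theta, mu=1e-3, n_samples=10, rng=None):
    """Estimate the gradient of loss_fn at theta using forward passes only.

    For a random direction u, (loss(theta + mu*u) - loss(theta - mu*u)) / (2*mu)
    approximates the directional derivative along u; multiplying by u and
    averaging over directions approximates the full gradient, with no
    backpropagation and hence no stored activations.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        u = rng.standard_normal(theta.shape)       # random probe direction
        delta = loss_fn(theta + mu * u) - loss_fn(theta - mu * u)
        grad += (delta / (2.0 * mu)) * u           # two forward passes per probe
    return grad / n_samples

# One ZO-SGD step on a toy quadratic: the estimate points toward -2*theta.
loss = lambda w: float(np.sum(w ** 2))
w = np.ones(5)
w -= 0.1 * zo_gradient(loss, w)
```

The noise in this estimator (it is unbiased only for a smoothed version of the loss, with variance that grows with parameter dimension) is exactly the stochasticity the abstract says obscures the training dynamics, and which the NZK is introduced to describe.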