Introduces the Neural Zeroth-order Kernel (NZK), a theoretical foundation for how models evolve when trained without backpropagation.
March 24, 2026
Original Paper
Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective
arXiv · 2603.21169
The Takeaway
Zeroth-order optimization enables memory-efficient training on edge devices, since it needs no stored activations for backpropagation. This paper supplies a mathematical framework for understanding how models evolve under it, aiming to do for ZO training what NTK theory did for first-order deep learning research.
From the abstract
Zeroth-order (ZO) optimization enables memory-efficient training of neural networks by estimating gradients via forward passes only, eliminating the need for backpropagation. However, the stochastic nature of gradient estimation significantly obscures the training dynamics, in contrast to the well-characterized behavior of first-order methods under Neural Tangent Kernel (NTK) theory. To address this, we introduce the Neural Zeroth-order Kernel (NZK) to describe model evolution in function space.
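For context, the "well-characterized behavior" of first-order methods is the standard NTK result: under gradient flow, a network's outputs evolve in function space according to a kernel built from parameter gradients. A minimal statement of that baseline follows (this is standard first-order NTK theory, not the paper's NZK, whose definition is not quoted here):

```latex
% Empirical NTK: inner product of parameter gradients at two inputs
\Theta(x, x') = \nabla_\theta f(x;\theta)^{\top} \nabla_\theta f(x';\theta)

% Under gradient flow on a loss \mathcal{L}, outputs evolve through the kernel
\frac{d}{dt} f(x;\theta_t) = -\eta \sum_{i=1}^{n} \Theta(x, x_i)\, \nabla_{f(x_i)} \mathcal{L}
```

The "forward passes only" mechanism the abstract describes is, in most ZO work, a two-point randomized finite-difference estimator (SPSA/MeZO-style). Below is a minimal NumPy sketch under that assumption; the perturbation scale `mu`, the sample count `n_samples`, and the toy quadratic loss are illustrative choices, not taken from the paper:

```python
import numpy as np

def zo_gradient(loss_fn, theta, mu=1e-3, n_samples=10, rng=None):
    """Estimate the gradient of loss_fn at theta using forward passes only.

    For a random direction u, (loss(theta + mu*u) - loss(theta - mu*u)) / (2*mu)
    approximates the directional derivative along u; multiplying by u and
    averaging over directions approximates the full gradient, with no
    backpropagation and hence no stored activations.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        u = rng.standard_normal(theta.shape)       # random probe direction
        delta = loss_fn(theta + mu * u) - loss_fn(theta - mu * u)
        grad += (delta / (2.0 * mu)) * u           # two forward passes per probe
    return grad / n_samples

# One ZO-SGD step on a toy quadratic: the estimate points toward -2*theta.
loss = lambda w: float(np.sum(w ** 2))
w = np.ones(5)
w -= 0.1 * zo_gradient(loss, w)
```

The noise in this estimator (it is unbiased only for a smoothed version of the loss, with variance that grows with parameter dimension) is exactly the stochasticity the abstract says obscures the training dynamics, and which the NZK is introduced to describe.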