AI & ML Scaling Insight

Theoretical analysis shows that the sample-efficiency benefits diffusion models gain from low-dimensional structure in the data diminish significantly when the data manifold is non-linear.

March 25, 2026

Original Paper

Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data

Anand Jerry George, Nicolas Macris

arXiv · 2603.22962

The Takeaway

This challenges the assumption that the sample complexity of diffusion models always scales with the intrinsic (rather than ambient) dimension of the data. It provides a more nuanced picture of sample complexity for generative models, suggesting that architectural or algorithmic changes are needed to exploit non-linear manifold structure effectively.

From the abstract

We study the theoretical behavior of denoising score matching (the learning task associated with diffusion models) when the data distribution is supported on a low-dimensional manifold and the score is parameterized using a random feature neural network. We derive asymptotically exact expressions for the test, train, and score errors in the high-dimensional limit. Our analysis reveals that, for linear manifolds, the sample complexity required to learn the score function scales linearly with the intrinsic dimension […]
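
To make the setup concrete, here is a minimal sketch (not the authors' code) of denoising score matching with a random-feature score model on a linear manifold. Everything specific in it is an illustrative assumption: a single noise level sigma, tanh random features, a ridge penalty lam, and small toy dimensions. Because only the readout weights are trained, the DSM objective reduces to ridge regression with a closed-form solution.

```python
# Sketch: denoising score matching (DSM) with a random-feature score model.
# Data lie on a d-dimensional linear subspace of R^D; the feature map is fixed
# at initialization and only the readout W is fit. All constants are toy values.
import numpy as np

rng = np.random.default_rng(0)

D, d = 50, 3            # ambient and intrinsic dimensions (illustrative)
n, p = 2000, 500        # training samples and number of random features
sigma, lam = 0.5, 1e-3  # noise level and ridge penalty (assumptions)

# Linear manifold: x = U z with a random orthonormal basis U in R^{D x d}.
U, _ = np.linalg.qr(rng.standard_normal((D, d)))
z = rng.standard_normal((n, d))
x = z @ U.T

# Noisy samples and DSM regression targets: the loss
#   E || s(x_t) - (x - x_t) / sigma^2 ||^2
# is minimized by the score of the noised data distribution.
eps = rng.standard_normal((n, D))
x_t = x + sigma * eps
target = (x - x_t) / sigma**2          # equals -eps / sigma

# Random-feature map phi(x) = tanh(F x / sqrt(D)), with F fixed.
F = rng.standard_normal((p, D))
phi = np.tanh(x_t @ F.T / np.sqrt(D))  # shape (n, p)

# Closed-form ridge solution for the readout W in s(x) = phi(x) W.
A = phi.T @ phi + lam * n * np.eye(p)
W = np.linalg.solve(A, phi.T @ target)  # shape (p, D)

# Held-out DSM test error.
z_te = rng.standard_normal((n, d))
x_te = z_te @ U.T
xt_te = x_te + sigma * rng.standard_normal((n, D))
pred = np.tanh(xt_te @ F.T / np.sqrt(D)) @ W
test_err = np.mean(np.sum((pred - (x_te - xt_te) / sigma**2) ** 2, axis=1))
print(f"DSM test error: {test_err:.3f}")
```

Swapping the linear embedding x = U z for a non-linear one (for instance, passing z through a fixed non-linear map before embedding it in R^D) gives the non-linear manifold setting the paper contrasts with the linear case.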