Reveals that diffusion models overfit at intermediate noise levels that standard evaluation metrics typically ignore.
arXiv · March 17, 2026 · 2603.13419
The Takeaway
The paper challenges the assumption that large diffusion models generalize uniformly. It shows that memorization is noise-level dependent and that denoising trajectories during inference often avoid the specific regions where training data is memorized, providing a new lens for evaluating model privacy and robustness.
From the abstract
Standard evaluation metrics suggest that Denoising Diffusion Models based on U-Net or Transformer architectures generalize well in practice. However, as it can be shown that an optimal Diffusion Model fully memorizes the training data, the model error determines generalization. Here, we show that although sufficiently large denoiser models show increasing memorization of the training set with increasing training time, the resulting denoising trajectories do not follow this trend. Our experiments …
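The excerpt's premise, that an optimal diffusion model fully memorizes its training set, can be illustrated with a toy sketch. Under the empirical training distribution, the minimum-MSE denoiser E[x0 | x_noisy] is a softmax-weighted average of the training points; at low noise it collapses onto the originating training example (memorization), while at high noise it averages toward the data mean. This is a minimal illustration of that known closed form, not the paper's own experimental setup; the 2-D Gaussian toy data and the noise levels chosen are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(64, 2))  # toy 2-D "training set" (assumption, for illustration)

def optimal_denoiser(x_noisy, sigma):
    """MMSE denoiser for the empirical training distribution.

    E[x0 | x_noisy] is a softmax-weighted average of the training points,
    with weights exp(-||x_noisy - x_i||^2 / (2 sigma^2)). This is the
    'optimal' model that, by construction, memorizes the training data.
    """
    d2 = ((train - x_noisy) ** 2).sum(axis=1)
    logw = -d2 / (2 * sigma**2)
    w = np.exp(logw - logw.max())  # shift for numerical stability
    w /= w.sum()
    return w @ train

# Probe memorization per noise level: noise each training point, denoise it,
# and check whether the output lands back on that same training point.
recovery = {}
for sigma in (0.05, 0.5, 2.0):
    hits = 0
    for i, x0 in enumerate(train):
        x_hat = optimal_denoiser(x0 + sigma * rng.normal(size=2), sigma)
        hits += ((train - x_hat) ** 2).sum(axis=1).argmin() == i
    recovery[sigma] = hits / len(train)
    print(f"sigma={sigma}: denoiser recovers its own training point {recovery[sigma]:.0%}")
```

At small sigma the recovery rate is near 100% (pure memorization), and it drops sharply as sigma grows; the intermediate regime in between is where the paper locates the overfitting that standard metrics miss.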