AI & ML Paradigm Shift

Shifts 3D scene generation from diffusion to a fully autoregressive paradigm using next-token prediction of 3D Gaussian primitives.

March 30, 2026

Original Paper

GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Nicolas von Lützow, Barbara Rössle, Katharina Schmid, Matthias Nießner

arXiv · 2603.26661

The Takeaway

By treating 3D scenes as sequences of tokens, it enables LLM-style capabilities like partial scene completion and outpainting while remaining compatible with real-time neural rendering pipelines.

From the abstract

Most recent advances in 3D generative modeling rely on diffusion or flow-matching formulations. We instead explore a fully autoregressive alternative and introduce GaussianGPT, a transformer-based model that directly generates 3D Gaussians via next-token prediction, thus facilitating full 3D scene generation. We first compress Gaussian primitives into a discrete latent grid using a sparse 3D convolutional autoencoder with vector quantization. The resulting tokens are serialized and modeled using