WorldMesh generates consistent, large-scale 3D worlds by populating a geometric mesh scaffold with image diffusion-derived content.
March 25, 2026
Original Paper
WorldMesh: Generating Navigable Multi-Room 3D Scenes via Mesh-Conditioned Image Diffusion
arXiv · 2603.22972
The Takeaway
Existing 3D scene generation often fails at scale because of geometric drift; by anchoring appearance to an explicit mesh scaffold, this 'geometry-first' approach maintains multi-room consistency while preserving photorealism. It enables navigable, environment-scale immersive worlds that were previously difficult to generate consistently from text.
From the abstract
Recent progress in image and video synthesis has inspired their use in advancing 3D scene generation. However, we observe that text-to-image and -video approaches struggle to maintain scene- and object-level consistency beyond a limited environment scale due to the absence of explicit geometry. We thus present a geometry-first approach that decouples this complex problem of large-scale 3D scene synthesis into its structural composition, represented as a mesh scaffold, and realistic appearance synthesis. […]
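The decoupling described in the abstract can be sketched as a two-stage pipeline: first lay out coarse scene geometry, then condition an image diffusion model on that geometry to synthesize appearance. The sketch below is purely illustrative; all names (`build_mesh_scaffold`, `texture_with_diffusion`, `RoomBox`) are hypothetical stand-ins, not the paper's actual API, and the diffusion stage is a stub.

```python
# Hypothetical sketch of a geometry-first generation pipeline.
# Stage 1 produces explicit geometry; Stage 2 would condition an image
# diffusion model on renders of that geometry. Names are illustrative only.
from dataclasses import dataclass
from typing import List, Dict, Tuple


@dataclass
class RoomBox:
    """Axis-aligned box standing in for one room of the mesh scaffold."""
    name: str
    min_xyz: Tuple[float, float, float]
    max_xyz: Tuple[float, float, float]


def build_mesh_scaffold(room_specs: List[str]) -> List[RoomBox]:
    """Stage 1: lay out rooms as coarse geometry (here: 4x3x4 m boxes in a row).

    The shared scaffold is what gives the pipeline scene-level consistency:
    every later appearance pass sees the same explicit geometry.
    """
    return [
        RoomBox(spec, (i * 5.0, 0.0, 0.0), (i * 5.0 + 4.0, 3.0, 4.0))
        for i, spec in enumerate(room_specs)
    ]


def texture_with_diffusion(room: RoomBox) -> Dict[str, str]:
    """Stage 2 (stub): a mesh-conditioned image diffusion model would render
    depth/normal maps of the scaffold and synthesize matching appearance.
    Here we just return a placeholder record."""
    return {"room": room.name, "texture": f"diffusion_output_for_{room.name}"}


def generate_world(room_specs: List[str]) -> List[Dict[str, str]]:
    scaffold = build_mesh_scaffold(room_specs)            # geometry first
    return [texture_with_diffusion(r) for r in scaffold]  # appearance second


world = generate_world(["kitchen", "hallway", "study"])
print([entry["room"] for entry in world])  # → ['kitchen', 'hallway', 'study']
```

The point of the structure is that appearance generation never has to infer geometry: consistency across rooms comes from the scaffold, while the diffusion stage only fills in per-view detail.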