LLMs aren't 'visualizing' the mazes they solve; they are just following tokenized directions that fall apart if the layout format changes.
April 14, 2026
Original Paper
Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks
arXiv · 2604.10690
The Takeaway
Maze-solving accuracy in LLMs collapses when the same maze is presented as a visual grid instead of an adjacency list. This strongly suggests that LLMs lack a robust internal spatial model and instead rely on sophisticated pattern matching tied to the specific formatting of the input data.
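To make the format contrast concrete, here is a hedged sketch (not the paper's actual encoding, which may differ) of the same tiny maze expressed both ways, plus a breadth-first-search solver over the adjacency-list form:

```python
from collections import deque

# The same 3x3 maze in two encodings (illustrative only).
# Cells are (row, col); a move is legal only between listed neighbors.

# 1) Adjacency-list encoding: explicit neighbor relations.
adjacency = {
    (0, 0): [(0, 1)],
    (0, 1): [(0, 0), (1, 1)],
    (1, 1): [(0, 1), (2, 1)],
    (2, 1): [(1, 1), (2, 2)],
    (2, 2): [(2, 1)],
}

# 2) Visual-grid encoding: '#' = wall, '.' = open; same topology.
grid = [
    "..#",
    "#.#",
    "#..",
]

def bfs(adj, start, goal):
    """Shortest path over the adjacency list via breadth-first search."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adj[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(bfs(adjacency, (0, 0), (2, 2)))
# → [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]
```

The two encodings carry identical information, so a model with a genuine spatial world model should handle either equally well; the reported collapse when only the surface format changes is what motivates the paper's pattern-matching interpretation.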
From the abstract
Foundation models have shown remarkable performance across diverse tasks, yet their ability to construct internal spatial world models for reasoning and planning remains unclear. We systematically evaluate the spatial understanding of large language models through maze tasks, a controlled testing context requiring multi-step planning and spatial abstraction. Across comprehensive experiments with Gemini-2.5-Flash, GPT-5-mini, Claude-Haiku-4.5, and DeepSeek-Chat, we uncover significant discrepancies […]