SeriesFusion
Science, curated & edited by AI

There is a hard physical ceiling on what images can tell us about our environment.

Despite the hype around AI, Vision-Language Models fail to predict outdoor thermal comfort because the meteorological variables that determine it cannot be observed in pixels. This is a clear 'negative result' against the assumption that more data equals better sensing.

Original Paper

From Pixels to UTCI: A Zero-Shot Framework for Predicting Outdoor Thermal Comfort from Street View Images Using Vision-Language Models

Luo Yanhuo

SSRN  ·  6570820

Street-level prediction of the Universal Thermal Climate Index (UTCI) typically requires microclimate simulations or dense sensor networks, both costly and difficult to scale. This study proposes a zero-shot framework in which Vision-Language Models (VLMs) predict UTCI directly from street-view images without task-specific training. Two VLMs — Gemini 2.5 Flash (commercial) and LLaVA 1.6 7B (open-source) — are evaluated across four cities in three Köppen climate zones (600 images, 10,800 inferenc