Introduces MeteoCap-3B, a billion-scale meteorological dataset with expert captions, alongside a spectral-aware diffusion model for weather time-series generation.
March 31, 2026
Original Paper
Spectral-Aware Text-to-Time Series Generation with Billion-Scale Multimodal Meteorological Data
arXiv · 2603.27135
The Takeaway
The paper democratizes high-fidelity meteorological modeling by releasing a massive dataset together with a model that outperforms existing baselines in zero-shot forecasting and semantic control. It marks a significant jump in scale for research at the intersection of multimodal LLMs and the physical sciences.
From the abstract
Text-to-time-series generation is particularly important in meteorology, where natural language offers intuitive control over complex, multi-scale atmospheric dynamics. Existing approaches are constrained by the lack of large-scale, physically grounded multimodal datasets and by architectures that overlook the spectral-temporal structure of weather signals. We address these challenges with a unified framework for text-guided meteorological time-series generation. First, we introduce MeteoCap-3B,