AI & ML Open Release

Introduces MeteoCap-3B, a billion-scale meteorological dataset with expert captions, along with a spectral-aware diffusion model for weather time-series generation.

March 31, 2026

Original Paper

Spectral-Aware Text-to-Time Series Generation with Billion-Scale Multimodal Meteorological Data

Shijie Zhang

arXiv · 2603.27135

The Takeaway

This work democratizes high-fidelity meteorological modeling by releasing a massive dataset and a model that outperforms existing baselines in zero-shot forecasting and semantic control. It marks a significant jump in scale at the intersection of multimodal LLMs and the physical sciences.

From the abstract

Text-to-time-series generation is particularly important in meteorology, where natural language offers intuitive control over complex, multi-scale atmospheric dynamics. Existing approaches are constrained by the lack of large-scale, physically grounded multimodal datasets and by architectures that overlook the spectral-temporal structure of weather signals. We address these challenges with a unified framework for text-guided meteorological time-series generation. First, we introduce MeteoCap-3B,