Cross-domain sensor model that handles variable signal lengths and resolutions without retraining.
arXiv · March 13, 2026 · 2603.11950
Why it matters
SLIP aligns multivariate time-series data with natural language through cross-attention to a frozen LLM decoder. This enables zero-shot signal captioning and reasoning across diverse sensor configurations, addressing the rigidity of earlier SSL-based sensor models.
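The key property is that cross-attention pools over the signal axis, so the number of signal patches can vary freely while the output shape stays fixed. A minimal NumPy sketch of this idea (the weight matrices and shapes here are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(text_h, signal_h, Wq, Wk, Wv):
    """Text hidden states (queries) attend over signal embeddings (keys/values).

    Attention sums over the key axis, so the signal may contain any number
    of patches -- this is what allows variable signal lengths without
    retraining the decoder.
    """
    Q = text_h @ Wq                                   # (T, d)
    K = signal_h @ Wk                                 # (S, d); S varies
    V = signal_h @ Wv                                 # (S, d)
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))       # (T, S) attention map
    return A @ V                                      # (T, d), independent of S

rng = np.random.default_rng(0)
d = 16
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
text = rng.standard_normal((4, d))          # 4 text tokens
short_sig = rng.standard_normal((10, d))    # short signal: 10 patches
long_sig = rng.standard_normal((250, d))    # long signal: 250 patches

print(cross_attend(text, short_sig, Wq, Wk, Wv).shape)  # (4, 16)
print(cross_attend(text, long_sig, Wq, Wk, Wv).shape)   # (4, 16)
```

Both calls yield the same output shape, which is why a frozen decoder can consume signals of any length or resolution through this interface.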
From the abstract
Modern sensing systems generate large volumes of unlabeled multivariate time-series data. This abundance of unlabeled data makes self-supervised learning (SSL) a natural approach for learning transferable representations. However, most existing approaches are optimized for reconstruction or forecasting objectives and often fail to capture the semantic structure required for downstream classification and reasoning tasks. While recent sensor-language alignment methods improve semantic generalization …