CurveStream implements a curvature-aware hierarchical memory to handle streaming video in MLLMs without Out-of-Memory (OOM) errors.
March 23, 2026
Original Paper
CurveStream: Boosting Streaming Video Understanding in MLLMs via Curvature-Aware Hierarchical Visual Memory Management
arXiv · 2603.19571
The Takeaway
Most video LLMs fail on long streams because visual tokens grow linearly with stream length. This training-free method uses geometric curvature to identify critical semantic transitions, allowing models to maintain long-range context by intelligently pruning redundant frames in real time.
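To make the curvature idea concrete, here is a minimal sketch of how a curvature score over a frame-embedding trajectory could drive frame pruning. This is an illustration under simple assumptions (discrete second differences as the curvature proxy, a fixed frame budget), not the paper's actual implementation; the function names are hypothetical.

```python
import numpy as np

def curvature_scores(embeddings: np.ndarray) -> np.ndarray:
    """Discrete curvature proxy for a trajectory of frame embeddings.

    embeddings: array of shape (T, D), one row per frame.
    A high score at frame t means the trajectory bends sharply there,
    which we treat as a candidate semantic transition.
    """
    # Second difference approximates the trajectory's acceleration.
    accel = embeddings[2:] - 2 * embeddings[1:-1] + embeddings[:-2]
    scores = np.linalg.norm(accel, axis=1)
    # Pad so scores align with frame indices; endpoints get zero curvature.
    return np.concatenate([[0.0], scores, [0.0]])

def prune_frames(embeddings: np.ndarray, budget: int) -> np.ndarray:
    """Return sorted indices of the `budget` highest-curvature frames."""
    scores = curvature_scores(embeddings)
    return np.sort(np.argsort(scores)[-budget:])
```

On a trajectory that moves in a straight line and then turns, the score peaks at the turning frame, so a tight budget keeps the transition and drops the redundant in-between frames. A streaming system would apply this over a sliding window to keep memory bounded.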
From the abstract
Multimodal Large Language Models have achieved significant success in offline video understanding, yet their application to streaming videos is severely limited by the linear explosion of visual tokens, which often leads to Out-of-Memory (OOM) errors or catastrophic forgetting. Existing visual retention and memory management methods typically rely on uniform sampling, low-level physical metrics, or passive cache eviction. However, these strategies often lack intrinsic semantic awareness, potentially…