AI & ML Efficiency Breakthrough

Reduces reaction latency in flow-based VLA models by 10x, enabling real-time responsiveness on consumer GPUs.

March 20, 2026

Original Paper

FASTER: Rethinking Real-Time Flow VLAs

Yuxiang Lu, Zhe Liu, Xianzhe Fan, Zhenya Yang, Jinghua Hou, Junyi Li, Kaixin Ding, Hengshuang Zhao

arXiv · 2603.19199

The Takeaway

The paper identifies standard action-chunking schedules as the bottleneck for robot reaction time and introduces a Horizon-Aware Schedule that prioritizes immediate actions. This lets generalist policies handle highly dynamic tasks such as table tennis, which were previously too latency-sensitive for large VLAs.
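
To see why a chunking schedule can bound reaction time, consider a toy model. It is an assumption made for illustration, not the paper's analysis: the policy captures a fresh observation every `period_ms` milliseconds, and the action chunk computed from that observation becomes available `latency_ms` later. An event that occurs between observations must wait for the next observation and then for inference. The names `reaction_time_ms`, `period_ms`, and `latency_ms` are hypothetical, and the numbers are made up.

```python
from collections import Counter


def reaction_time_ms(event_ms: int, period_ms: int, latency_ms: int) -> int:
    """Delay from an event to the first action that can reflect it (toy model)."""
    # Time elapsed since the most recent observation was captured.
    since_last_obs = event_ms % period_ms
    # The event is first seen at the next observation
    # (an event exactly on an observation tick is assumed to be missed) ...
    wait_for_next_obs = period_ms - since_last_obs
    # ... and the chunk computed from it is ready latency_ms later.
    return wait_for_next_obs + latency_ms


# Hypothetical settings: observe every 500 ms, inference takes 100 ms.
PERIOD_MS = 500
LATENCY_MS = 100

# Sweep event times evenly over ten full observation periods.
times = [reaction_time_ms(t, PERIOD_MS, LATENCY_MS) for t in range(10 * PERIOD_MS)]
histogram = Counter(times)

print("min:", min(times), "max:", max(times), "mean:", sum(times) / len(times))
print("every delay equally frequent:", len(set(histogram.values())) == 1)
```

In this model every delay in the range from just above `latency_ms` up to `latency_ms + period_ms` occurs equally often, so both the observation interval and the inference delay contribute to how quickly the robot can react.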

From the abstract

Real-time execution is crucial for deploying Vision-Language-Action (VLA) models in the physical world. Existing asynchronous inference methods primarily optimize trajectory smoothness, but neglect the critical latency in reacting to environmental changes. By rethinking the notion of reaction in action chunking policies, this paper presents a systematic analysis of the factors governing reaction time. We show that reaction time follows a uniform distribution determined jointly by the Time to Fir