Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.
arXiv · March 17, 2026 · 2603.14375
The Takeaway
Current video generators lack a reliable internal clock, which limits their use as physical world models. Visual Chronometer lets practitioners ground and correct the temporal scale of generated videos, enabling more realistic motion and better alignment with real-world physics.
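To make "correcting the temporal scale" concrete, here is a minimal sketch of the arithmetic involved. It assumes an estimate of a clip's physical frame rate is already available, for example from a tool like the one described. The function names and the numbers are hypothetical illustrations and are not taken from the paper.

```python
def apparent_speed(physical_fps: float, playback_fps: float) -> float:
    """Ratio of on-screen speed to real-world speed.

    physical_fps: real-world frames per second implied by the motion
                  (assumed to come from an external estimator).
    playback_fps: frame rate at which the clip is actually played.
    """
    if physical_fps <= 0 or playback_fps <= 0:
        raise ValueError("frame rates must be positive")
    return playback_fps / physical_fps


def real_duration_seconds(num_frames: int, physical_fps: float) -> float:
    """Real-world time spanned by the clip at its physical frame rate."""
    if physical_fps <= 0:
        raise ValueError("physical_fps must be positive")
    return num_frames / physical_fps


# Hypothetical example: motion sampled at 60 physical fps, played at 30 fps.
speed = apparent_speed(physical_fps=60.0, playback_fps=30.0)
duration = real_duration_seconds(num_frames=120, physical_fps=60.0)
```

Under these assumptions a speed ratio below 1 means the clip looks like slow motion, and playing it back at its physical frame rate restores real-time motion.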
From the abstract
While recent generative video models have achieved remarkable visual realism and are being explored as world models, true physical simulation requires mastering both space and time. Current models can produce visually smooth kinematics, yet they lack a reliable internal motion pulse to ground these motions in a consistent, real-world time scale. This temporal ambiguity stems from the common practice of indiscriminately training on videos with vastly different real-world speeds, forcing them into