ICPRL enables vision-language models to acquire physical intuition and adapt their policies in-context through trial-and-error interaction.
March 17, 2026
Original Paper
ICPRL: Acquiring Physical Intuition from Interactive Control
arXiv · 2603.13295
The Takeaway
ICPRL moves physical reasoning from static perception to active, vision-grounded reinforcement learning that requires no weight updates. This allows robots or agents to adapt to novel physical puzzles and environments purely through their accumulated interaction history.
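To make the mechanism concrete, here is a minimal sketch of an in-context adaptation loop in the spirit the paper describes: a frozen policy (a stub standing in for a VLM) conditions on the growing history of (observation, action, reward) triples and improves without any weight updates. All names (`stub_vlm_policy`, `toy_env`) are hypothetical illustrations, not the paper's code.

```python
import random

def stub_vlm_policy(history, actions, eps=0.3):
    """Stand-in for a frozen VLM: with probability eps it explores;
    otherwise it greedily picks the action with the best average reward
    observed in the in-context history. No parameters are updated."""
    if not history or random.random() < eps:
        return random.choice(actions)
    stats = {}
    for _, a, r in history:
        total, n = stats.get(a, (0.0, 0))
        stats[a] = (total + r, n + 1)
    def avg(a):
        total, n = stats.get(a, (0.0, 0))
        return total / n if n else 0.0
    return max(actions, key=avg)

def toy_env(action):
    """Toy 'physics': only action 2 succeeds."""
    return 1.0 if action == 2 else 0.0

history = []          # (observation, action, reward) triples: the in-context memory
actions = [0, 1, 2]
for step in range(20):
    obs = "frame_%d" % step          # placeholder for pixel observations
    a = stub_vlm_policy(history, actions)
    r = toy_env(a)
    history.append((obs, a, r))      # adaptation happens only by growing the context
```

All learning here lives in `history`: once the rewarded action appears in the context, the greedy choice locks onto it, which is the essence of adapting through interaction history rather than gradient updates.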
From the abstract
VLMs excel at static perception but falter in interactive reasoning in dynamic physical environments, which demands planning and adaptation to dynamic outcomes. Existing physical reasoning methods often depend on abstract symbolic inputs or lack the ability to learn and adapt from direct, pixel-based visual interaction in novel scenarios. We introduce ICPRL (In-Context Physical Reinforcement Learning), a framework inspired by In-Context Reinforcement Learning (ICRL) that empowers VLMs to acquire …