AI & ML Open Release

Surg-R1 is a specialized surgical reasoning model released alongside the largest surgical Chain-of-Thought dataset (320,000 pairs).

arXiv · March 16, 2026 · 2603.12430

Jian Jiang, Chenxi Lin, Yiming Gu, Zengyi Qin, Zhitao Zeng, Kun Yuan, Yonghao Long, Xiang Xia, Cheng Yuan, Yuqi Wang, Zijie Yue, Kunyi Yang, Yuting Zhang, Zhu Zhuo, Dian Qin, Xin Wang, NG Chi Fai, Brian Anthony, Daguang Xu, Guy Rosman, Ozanan Meireles, Zizhen Zhang, Nicolas Padoy, Hesheng Wang, Qi Dou, Yueming Jin, Yutong Ban

Why it matters

It democratizes high-quality reasoning in a specialized domain where general-purpose models like GPT-4 often fail. The hierarchical reasoning framework and multi-center validation provide a blueprint for building domain-specific 'R1' models.

From the abstract

Surgical scene understanding demands not only accurate predictions but also interpretable reasoning that surgeons can verify against clinical expertise. However, existing surgical vision-language models generate predictions without reasoning chains, and general-purpose reasoning models fail on compositional surgical tasks without domain-specific knowledge. We present Surg-R1, a surgical Vision-Language Model that addresses this gap through hierarchical reasoning trained via a four-stage pipeline