AI & ML Open Release

Releases an offline search-and-browse pipeline with 97K long-horizon trajectories for training 'Deep Research' agents.

March 24, 2026

Original Paper

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Zhuofeng Li, Dongfu Jiang, Xueguang Ma, Haoxiang Zhang, Ping Nie, Yuyu Zhang, Kai Zou, Jianwen Xie, Yu Zhang, Wenhu Chen

arXiv · 2603.20278

The Takeaway

Democratizes the development of agents like OpenAI's 'Deep Research' by providing a fully instrumented, reproducible environment and a dataset of 97K multi-turn reasoning and tool-use trajectories.

From the abstract

Training deep research agents requires long-horizon trajectories that interleave search, evidence aggregation, and multi-step reasoning. However, existing data collection pipelines typically rely on proprietary web APIs, making large-scale trajectory synthesis costly, unstable, and difficult to reproduce. We present OpenResearcher, a reproducible pipeline that decouples one-time corpus bootstrapping from multi-turn trajectory synthesis and executes the search-and-browse loop entirely offline usi