AI & ML Open Release

OpenSeeker democratizes the development of 'Deep Search' agents by open-sourcing its specialized training data and trajectory synthesis methods.

arXiv · March 17, 2026 · 2603.15594

Yuwen Du, Rui Ye, Shuo Tang, Xinyu Zhu, Yijun Lu, Yuzhu Cai, Siheng Chen

The Takeaway

Deep search capabilities (as in Perplexity or OpenAI Search) have been proprietary moats. The release of data and models that achieve state-of-the-art results with only 11k samples lets researchers build and iterate on high-performance search agents.

From the abstract

Deep search capabilities have become an indispensable competency for frontier Large Language Model (LLM) agents, yet the development of high-performance search agents remains dominated by industrial giants due to a lack of transparent, high-quality training data. This persistent data scarcity has fundamentally hindered the progress of the broader research community in developing and innovating within this domain. To bridge this gap, we introduce OpenSeeker, the first fully open-source search age