AI & ML · New Capability

Fine-tuning language models on journal publication records lets them match or exceed human experts in 'scientific taste': the ability to identify which research ideas are worth pursuing.

arXiv · March 18, 2026 · 2603.16659

Ziqin Gong, Ning Li, Huaikang Zhou

The Takeaway

The paper demonstrates that subjective evaluative judgment can be extracted from institutional traces, in this case journal publication decisions: the fine-tuned model reached 59% accuracy on this judgment task, versus 42% for human expert panels. This suggests a path toward automated triage for the massive volume of research submissions and grant applications that currently overwhelms human reviewers.
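
To make the triage idea concrete, here is a minimal sketch, assuming a fine-tuned model already emits an acceptance score per submission. The `Submission` fields, the threshold, and the sample data are hypothetical illustrations, not the paper's actual pipeline or metrics.

```python
# Hypothetical triage sketch: rank submissions by a fine-tuned model's
# acceptance score and measure agreement with eventual outcomes.
from dataclasses import dataclass


@dataclass
class Submission:
    abstract: str
    model_score: float   # assumed precomputed P(accept) from a fine-tuned model
    accepted: bool       # eventual publication decision (ground truth)


def triage_accuracy(submissions: list[Submission], threshold: float = 0.5) -> float:
    """Fraction of submissions where the thresholded score matches the outcome."""
    hits = sum((s.model_score >= threshold) == s.accepted for s in submissions)
    return hits / len(submissions)


def shortlist(submissions: list[Submission], k: int) -> list[Submission]:
    """Top-k submissions by model score: the pile a human panel would review."""
    return sorted(submissions, key=lambda s: s.model_score, reverse=True)[:k]


if __name__ == "__main__":
    batch = [
        Submission("Protein-folding follow-up ...", 0.81, True),
        Submission("Incremental benchmark tweak ...", 0.22, False),
        Submission("New funding-review protocol ...", 0.64, True),
    ]
    print(f"accuracy: {triage_accuracy(batch):.0%}")
    for s in shortlist(batch, k=2):
        print(f"{s.model_score:.2f}  {s.abstract[:40]}")
```

The design choice worth noting is that a scoring model supports both modes the Takeaway describes: thresholding yields accept/reject accuracy figures like the 59% reported, while ranking yields a shortlist for human reviewers.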

From the abstract

Artificial intelligence matches or exceeds human performance on tasks with verifiable answers, from protein folding to Olympiad mathematics. Yet the capacity that most governs scientific advance is not reasoning but taste: the ability to judge which untested ideas deserve pursuit, exercised daily by editors and funders but never successfully articulated, taught, or automated. Here we show that fine-tuning language models on journal publication decisions recovers evaluative judgment inaccessible …
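
For readers curious what "fine-tuning on journal publication decisions" might look like in practice, below is a minimal sketch of turning decision records into supervised training examples. The chat-style JSONL layout, field names, and prompt wording are assumptions for illustration, not details from the paper.

```python
# Hypothetical sketch: convert publication decisions into supervised
# fine-tuning examples in a chat-style JSONL format.
import json

records = [
    {"abstract": "We propose a new attention variant ...", "published": True},
    {"abstract": "A replication of prior results on ...", "published": False},
]

with open("taste_sft.jsonl", "w") as f:
    for r in records:
        example = {
            "messages": [
                {"role": "system",
                 "content": "You are a journal editor judging research promise."},
                {"role": "user",
                 "content": f"Abstract:\n{r['abstract']}\n\nShould this be pursued?"},
                # The publication decision becomes the supervision signal.
                {"role": "assistant",
                 "content": "accept" if r["published"] else "reject"},
            ]
        }
        f.write(json.dumps(example) + "\n")
```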