DebugLM allows developers to trace an LLM's specific behaviors back to individual training data sources.
arXiv · March 19, 2026 · 2603.17884
The Takeaway
By embedding provenance tags into the model during training, DebugLM enables precise debugging and targeted test-time remediation (such as selective refusal) without retraining. This closes a major observability gap in the multi-stage pipelines used to build foundation models.
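The provenance-tag idea can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the tag format, source names, and refusal logic are all hypothetical stand-ins for whatever DebugLM actually learns.

```python
# Hypothetical sketch of data provenance tagging and selective refusal.
# Tag scheme, source names, and blocklist are illustrative assumptions.

PROVENANCE_TAGS = {
    "web_crawl": "<src:web>",
    "code": "<src:code>",
    "forums": "<src:forum>",
}

def tag_example(text: str, source: str) -> str:
    """Prepend a provenance token so the model can learn to associate
    behaviors with the data source they came from."""
    return f"{PROVENANCE_TAGS[source]} {text}"

# Sources a developer has flagged during debugging.
BLOCKED_SOURCES = {"forums"}

def selective_refusal(attributed_sources: list[str], response: str) -> str:
    """Refuse only when the behavior traces back to a flagged source,
    leaving everything else untouched -- no retraining needed."""
    if any(s in BLOCKED_SOURCES for s in attributed_sources):
        return "[refused: attributed to a flagged data source]"
    return response

print(tag_example("def add(a, b): return a + b", "code"))
# -> <src:code> def add(a, b): return a + b
print(selective_refusal(["web_crawl"], "harmless answer"))
# -> harmless answer
print(selective_refusal(["forums"], "problematic answer"))
# -> [refused: attributed to a flagged data source]
```

The point of the sketch is the targeting: remediation applies only to behavior attributed to a flagged source, rather than blunt model-wide patching.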
From the abstract
Large language models (LLMs) are trained through multi-stage pipelines over heterogeneous data sources, yet developers lack a principled way to pinpoint the specific data responsible for an observed behavior. This lack of observability reduces debugging to reactive patching and makes failures prone to recur under distribution shift or subsequent model updates. To address this limitation, we propose DebugLM, a framework that equips LLMs with built-in data provenance, enabling them to explicitly t…