OpenSanctions Pairs releases a large-scale benchmark for entity matching and reports that local LLMs can now match production rule-based systems in high-stakes compliance tasks.
arXiv · March 13, 2026 · 2603.11051
Why it matters
This dataset of more than 750,000 labeled pairs gives entity resolution in finance and sanctions screening a large public benchmark. It reports that models like DeepSeek-R1-14B can achieve an F1 above 98%, which suggests that LLM-based matchers could compete with, and in some cases replace, traditional rule-based and fuzzy-matching engines for enterprise entity matching.
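For readers unfamiliar with the metric, here is a minimal sketch of how F1 is typically computed over labeled pairs, where each pair is marked match or non-match. The function name `pairwise_f1` and the toy labels are hypothetical and serve only as illustration; the paper's own evaluation code is not shown here and may differ.

```python
from typing import Dict, Sequence


def pairwise_f1(gold: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Precision, recall, and F1 for binary match / non-match decisions.

    Hypothetical helper: this is the standard F1 definition, not code from the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels (made up): True means "same entity", False means "different entities".
gold = [True, True, True, False, False, True]
predicted = [True, True, False, False, True, True]
scores = pairwise_f1(gold, predicted)
print(scores)
```

Because F1 is the harmonic mean of precision and recall, a matcher scores well only when both missed matches and false matches are rare, which is why it is a common headline number for deduplication tasks.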
From the abstract
We release OpenSanctions Pairs, a large-scale entity matching benchmark derived from real-world international sanctions aggregation and analyst deduplication. The dataset contains 755,540 labeled pairs spanning 293 heterogeneous sources across 31 countries, with multilingual and cross-script names, noisy and missing attributes, and set-valued fields typical of compliance workflows. We benchmark a production rule-based matcher (nomenklatura RegressionV1 algorithm) against open- and closed-source