Omnilingual MT scales machine translation to over 1,600 languages, an 8x increase in coverage over previous state-of-the-art systems.
arXiv · March 18, 2026 · 2603.16309
The Takeaway
This is a large leap in language coverage, particularly for under-supported languages, and it indicates that specialized LLMs at much smaller scales (1B-8B parameters) can outperform 70B-parameter models. It provides a blueprint for supporting the 'long tail' of human languages previously ignored by ML.
From the abstract
High-quality machine translation (MT) can scale to hundreds of languages, setting a high bar for multilingual systems. However, compared to the world's 7,000 languages, current systems still offer only limited coverage: about 200 languages on the target side, and maybe a few hundreds more on the source side, supported due to cross-lingual transfer. And even these numbers have been hard to evaluate due to the lack of reliable benchmarks and metrics. We present Omnilingual Machine Translation (OMT)