AI & ML Paradigm Shift

The distance between human languages can now be measured quantitatively using the attention mechanisms of multilingual transformers.

March 19, 2026

Original Paper

Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages

Yue Zhao, Jiatao Gu, Paloma Jeretič, Weijie Su

arXiv · 2603.17912

The Takeaway

The paper introduces Attention Transport Distance (ATD), a tokenization-agnostic measure that recovers linguistic relationships from model activations. ATD gives a principled way to guide transfer learning and training-data selection for low-resource machine translation.
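What might a measure like this look like in code? The excerpt doesn't spell out the ATD construction, so the following is only a minimal sketch under loud assumptions: `bert-base-multilingual-cased` stands in as the multilingual encoder, each language is summarized by the distribution of length-normalized query-key attention offsets, and SciPy's 1-D Wasserstein distance substitutes for whatever transport formulation the paper actually uses. Every model name, profile definition, and number below is illustrative, not the authors' method.

```python
# Hedged sketch of an attention-based language distance.
# NOT the paper's ATD: the encoder, the "attention profile", and the
# 1-D Wasserstein comparison are all assumptions made for illustration.
import torch
from scipy.stats import wasserstein_distance
from transformers import AutoModel, AutoTokenizer

MODEL = "bert-base-multilingual-cased"  # assumed stand-in encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_attentions=True)
model.eval()


def attention_profile(sentences):
    """Pool attention mass by length-normalized query-key offset.

    Normalizing offsets by sentence length is one crude way to reduce
    sensitivity to how aggressively a language gets subword-tokenized.
    Special tokens ([CLS]/[SEP]) are kept here for simplicity.
    """
    offsets, weights = [], []
    with torch.no_grad():
        for text in sentences:
            enc = tokenizer(text, return_tensors="pt", truncation=True)
            att = model(**enc).attentions            # tuple: layers x (1, H, T, T)
            att = torch.stack(att).mean(dim=(0, 2))[0]  # avg layers+heads -> (T, T)
            T = att.shape[0]
            pos = torch.arange(T, dtype=torch.float)
            # signed query-key offset, normalized to [-1, 1]
            off = (pos[:, None] - pos[None, :]) / max(T - 1, 1)
            offsets.append(off.flatten())
            weights.append(att.flatten())
    return torch.cat(offsets).numpy(), torch.cat(weights).numpy()


def attention_distance(lang_a, lang_b):
    """1-D Wasserstein distance between two languages' attention profiles."""
    xa, wa = attention_profile(lang_a)
    xb, wb = attention_profile(lang_b)
    return wasserstein_distance(xa, xb, u_weights=wa, v_weights=wb)


# Toy usage: comparable sentences in German and Spanish.
de = ["Der Hund schläft im Garten.", "Sie liest jeden Abend ein Buch."]
es = ["El perro duerme en el jardín.", "Ella lee un libro cada noche."]
print(f"distance(de, es) = {attention_distance(de, es):.4f}")
```

With a real sample of comparable corpora per language, pairwise distances from a routine like this could be fed into standard clustering or tree-building tools to check whether they recover known language families, which is the kind of validation the abstract gestures at.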

From the abstract

Understanding the distance between human languages is central to linguistics, anthropology, and tracing human evolutionary history. Yet, while linguistics has long provided rich qualitative accounts of cross-linguistic variation, a unified and scalable quantitative approach to measuring language distance remains lacking. In this paper, we introduce a method that leverages pretrained multilingual language models as systematic instruments for linguistic measurement. Specifically, we show that the …