Measuring the distance between human languages can now be done quantitatively using the attention mechanisms of multilingual transformers.
March 19, 2026
Original Paper
Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages
arXiv · 2603.17912
The Takeaway
The paper introduces Attention Transport Distance (ATD), a tokenization-agnostic measure that recovers linguistic relationships from model activations. It offers a principled way to choose transfer languages and select training data for low-resource machine translation.
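The paper's exact ATD formulation is not reproduced here, but the name suggests an optimal-transport cost over attention distributions. A minimal illustrative sketch of that general idea, under two stated assumptions (attention is summarized as a distribution over *relative* token offsets, so sequence length and tokenization do not change the support, and distributions are compared with a 1-D Wasserstein distance), might look like:

```python
import numpy as np

def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two discrete distributions on the
    same support grid: the sum of absolute CDF differences."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def attention_profile(attn, n_bins=10):
    """Average a (heads, seq, seq) attention tensor into a distribution
    over relative token offsets, bucketed into a fixed number of bins so
    that sequence length (hence tokenization) does not alter the support."""
    h, n, _ = attn.shape
    offsets = np.arange(n)[None, :] - np.arange(n)[:, None]  # j - i
    rel = offsets / max(n - 1, 1)                            # scaled to [-1, 1]
    bins = np.linspace(-1.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(rel, bins) - 1, 0, n_bins - 1)
    prof = np.zeros(n_bins)
    for head in attn:                      # accumulate mass per offset bin
        np.add.at(prof, idx, head)
    return prof / prof.sum()

# Toy "attention" standing in for two languages' patterns: same mechanism,
# different locality. Real ATD would use a pretrained multilingual model.
def toy_attn(n, spread):
    d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    a = np.exp(-d / spread)
    return (a / a.sum(axis=-1, keepdims=True))[None]  # one head

local   = attention_profile(toy_attn(12, 1.0))
diffuse = attention_profile(toy_attn(12, 6.0))
same    = attention_profile(toy_attn(16, 1.0))  # longer input, same behavior

print(wasserstein_1d(local, diffuse))  # larger: genuinely different pattern
print(wasserstein_1d(local, same))     # smaller: similar despite length mismatch
```

The point of the sketch is the invariance: `local` and `same` come from inputs of different lengths but end up close under the transport cost, while `local` and `diffuse` differ in attention shape and end up far apart. How the actual ATD aggregates across layers, heads, and parallel corpora is specified in the paper itself.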
From the abstract
Understanding the distance between human languages is central to linguistics, anthropology, and tracing human evolutionary history. Yet, while linguistics has long provided rich qualitative accounts of cross-linguistic variation, a unified and scalable quantitative approach to measuring language distance remains lacking. In this paper, we introduce a method that leverages pretrained multilingual language models as systematic instruments for linguistic measurement. Specifically, we show that the