Dirichlet mixture models used in AI and genetics can be 'ambiguous': two different sets of parameters can describe exactly the same distribution.
March 24, 2026
Original Paper
On the identifiability of Dirichlet mixture models
arXiv · 2603.21914
The Takeaway
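This research proves that finite mixtures of Dirichlet distributions are 'non-identifiable' on their full parameter space, meaning different sets of components and weights can produce exactly the same distribution of data. For these models, data alone cannot reveal which parameter configuration generated it, although the paper shows that identifiability is recovered once the parameters are suitably restricted.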
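To make the ambiguity concrete, here is a minimal numerical sketch of one standard form of such a shift identity. It assumes the shift adds 1 to a single coordinate of the parameter vector and that component $j$ gets weight $\alpha_j / \sum_k \alpha_k$; the paper's exact statement may differ. The function names and the test point are illustrative, not taken from the paper.

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at a point x in the interior of the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))

def dirichlet_pdf(x, alpha):
    return math.exp(dirichlet_logpdf(x, alpha))

def shifted_mixture_pdf(x, alpha):
    """Density of a mixture of J Dirichlets (assumed form of the shift identity).

    Component j uses alpha with its j-th entry increased by 1,
    and has mixing weight alpha_j / sum(alpha).
    """
    total = sum(alpha)
    density = 0.0
    for j, a_j in enumerate(alpha):
        shifted = list(alpha)
        shifted[j] += 1.0
        density += (a_j / total) * dirichlet_pdf(x, shifted)
    return density

# Illustrative parameters and evaluation point (J = 3, so a 2-dimensional simplex).
alpha = [2.0, 3.5, 0.7]
x = [0.2, 0.5, 0.3]

single = dirichlet_pdf(x, alpha)
mixture = shifted_mixture_pdf(x, alpha)
print(single, mixture)  # the two values should agree up to floating-point error
```

The agreement follows because the coordinates of $x$ sum to 1, so multiplying the density by $\sum_j x_j$ changes nothing, and each term $x_j \cdot \mathrm{Dir}(x; \alpha)$ is a rescaled Dirichlet density with the $j$-th parameter raised by 1. A single Dirichlet and a $J$-component mixture are therefore indistinguishable from data.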
From the abstract
We study identifiability of finite mixtures of Dirichlet distributions on the interior of the simplex. We first prove a shift identity showing that every Dirichlet density can be written as a mixture of $J$ shifted Dirichlet densities, where $J-1$ is the dimension of the simplex support, which yields non-identifiability on the full parameter space. We then show that identifiability is recovered on a fixed-total parameter slice and on restricted box-type regions. On the full parameter space, we p