Reveals that models with identical predictive performance produce fundamentally different feature attributions based solely on their hypothesis class.
arXiv · March 18, 2026 · 2603.15821
The Takeaway
This 'Explanation Lottery' challenges the reliability of XAI for auditing and regulation, showing that model selection is not explanation-neutral. The paper also introduces a diagnostic score for identifying when explanations actually are stable across different architectures.
From the abstract
The assumption that prediction-equivalent models produce equivalent explanations underlies many practices in explainable AI, including model selection, auditing, and regulatory evaluation. In this work, we show that this assumption does not hold. Through a large-scale empirical study across 24 datasets and multiple model classes, we find that models with identical predictive behavior can produce substantially different feature attributions. This disagreement is highly structured […]
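To make the phenomenon concrete, here is a minimal sketch of the core experiment, not the paper's actual protocol or its diagnostic score: fit two different model classes on the same data, verify their predictions largely agree, then compare their feature attributions. The dataset, the two model classes, permutation importance as the attribution method, and Spearman correlation as the agreement measure are all illustrative choices.

```python
# Hypothetical illustration of the "explanation lottery": high prediction
# agreement between two model classes need not imply agreeing attributions.
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two hypothesis classes trained on identical data.
models = {
    "logistic": LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
    "forest": RandomForestClassifier(random_state=0).fit(X_tr, y_tr),
}

# Check that predictive behavior is (approximately) equivalent.
preds = {name: m.predict(X_te) for name, m in models.items()}
print(f"prediction agreement: {np.mean(preds['logistic'] == preds['forest']):.3f}")

# Feature attributions via permutation importance on held-out data.
attributions = {
    name: permutation_importance(m, X_te, y_te, n_repeats=10,
                                 random_state=0).importances_mean
    for name, m in models.items()
}

# Low rank correlation despite high prediction agreement is the
# explanation lottery in miniature.
rho, _ = spearmanr(attributions["logistic"], attributions["forest"])
print(f"attribution rank correlation (Spearman): {rho:.3f}")
```

Permutation importance is used here only because it is model-agnostic, so both hypothesis classes can be attributed with the same procedure; any other attribution method (e.g., SHAP) could be substituted to run the same comparison.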