The SOMP attack demonstrates that private training text can be reconstructed from shared gradients even at high batch sizes (up to B=128).
arXiv · March 18, 2026 · 2603.16761
The Takeaway
SOMP breaks the common assumption that gradient aggregation in federated or distributed learning provides a sufficient privacy buffer for LLMs. This high-fidelity sparse-recovery attack suggests that current gradient-sharing protocols are significantly more vulnerable to privacy leakage than previously estimated.
From the abstract
Gradient inversion attacks reveal that private training text can be reconstructed from shared gradients, posing a privacy risk to large language models (LLMs). While prior methods perform well in small-batch settings, scaling to larger batch sizes and longer sequences remains challenging due to severe signal mixing, high computational cost, and degraded fidelity. We present SOMP (Subspace-Guided Orthogonal Matching Pursuit), a scalable gradient inversion framework that casts text recovery from a […]
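To make the sparse-recovery framing concrete, the sketch below shows classic orthogonal matching pursuit (OMP), the greedy algorithm SOMP's name refers to: given a dictionary of atoms and a mixed signal, it iteratively picks the atom most correlated with the unexplained residual and re-fits over the selected set. This is not the paper's SOMP attack; the dictionary (a token embedding matrix), the signal (an aggregated gradient), and the fixed sparsity budget are illustrative assumptions.

```python
import numpy as np

def omp(dictionary, signal, max_atoms):
    """Greedily select dictionary rows whose span best explains `signal`.

    dictionary : (n_atoms, dim) array, e.g. a token embedding matrix (assumption)
    signal     : (dim,) array, e.g. an aggregated embedding-layer gradient (assumption)
    max_atoms  : sparsity budget, i.e. how many atoms to recover
    """
    residual = signal.copy()
    support = []
    for _ in range(max_atoms):
        # Pick the atom most correlated with what is still unexplained.
        correlations = dictionary @ residual
        best = int(np.argmax(np.abs(correlations)))
        if best in support:
            break
        support.append(best)
        # Re-fit coefficients over the whole support (the "orthogonal" step).
        basis = dictionary[support].T                    # (dim, |support|)
        coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
        residual = signal - basis @ coeffs
    return support, residual

# Toy usage: recover which 3 of 1000 atoms were mixed into the signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((1000, 64))
true_support = [17, 256, 903]
y = D[true_support].sum(axis=0)
recovered, _ = omp(D, y, max_atoms=3)
print(sorted(recovered))  # typically matches true_support for random dictionaries
```

The toy example works because random high-dimensional atoms are nearly orthogonal; the hard part the paper addresses is doing this at scale, where large batches mix many sequences into one gradient and naive greedy selection degrades.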