A unified L0-gating mechanism that enables comparable sparsification and pruning across graphs, text, and tabular data.
March 31, 2026
Original Paper
Sparse-by-Design Cross-Modality Prediction: L0-Gated Representations for Reliable and Efficient Learning
arXiv · 2603.26801
The Takeaway
Instead of maintaining separate pruning techniques for each architecture, L0GM provides a modality-agnostic primitive for feature-wise gating. Practitioners get a single efficiency-accuracy trade-off 'knob' that applies uniformly across heterogeneous end-to-end pipelines (GNNs, Transformers, and MLPs).
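The paper's exact parameterization isn't reproduced here, but the standard way to implement a differentiable L0 gate is the hard-concrete relaxation of Louizos et al. (2018). The sketch below is an illustrative PyTorch module under that assumption, not the authors' code; the `L0Gate` name, its hyperparameters, and the penalty weight are placeholders.

```python
import math

import torch
import torch.nn as nn


class L0Gate(nn.Module):
    """Feature-wise L0 gate via the hard-concrete relaxation
    (Louizos et al., 2018): one learnable gate per feature."""

    def __init__(self, num_features, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(num_features))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self, x):
        if self.training:
            # Reparameterized sample from the concrete distribution.
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (u.log() - (1 - u).log() + self.log_alpha) / self.beta
            )
        else:
            # Deterministic gate at evaluation time.
            s = torch.sigmoid(self.log_alpha)
        # Stretch to (gamma, zeta), then clip to [0, 1] so gates hit exact 0/1.
        z = torch.clamp(s * (self.zeta - self.gamma) + self.gamma, 0.0, 1.0)
        return x * z  # broadcasts over the batch dimension

    def l0_penalty(self):
        # Expected number of non-zero gates; weight this term in the loss.
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()


# Usage: the same module gates tabular features, GNN node features,
# or token embeddings; the penalty weight is the single trade-off knob.
gate = L0Gate(num_features=64)
h = gate(torch.randn(32, 64))               # gated features
reg = 1e-3 * gate.l0_penalty()              # add to the task loss
```

The knob metaphor maps onto the penalty weight: a larger coefficient on `l0_penalty()` drives more gates to exactly zero, trading accuracy for sparsity in the same way regardless of the backbone architecture.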
From the abstract
Predictive systems increasingly span heterogeneous modalities such as graphs, language, and tabular records, but sparsity and efficiency remain modality-specific (graph edge or neighborhood sparsification, Transformer head or layer pruning, and separate tabular feature-selection pipelines). This fragmentation makes results hard to compare, complicates deployment, and weakens reliability analysis across end-to-end KDD pipelines. A unified sparsification primitive would make accuracy-efficiency trade-offs…