AI & ML Scaling Insight

Identifies 'label leakage' from limited task diversity as the primary bottleneck for relational foundation models, rather than raw data volume.

April 1, 2026

Original Paper

Task Scarcity and Label Leakage in Relational Transfer Learning

Francisco Galuppo Azevedo, Clarissa Lima Loures, Denis Oliveira Correa

arXiv · 2603.29914

The Takeaway

Practitioners building tabular or relational models often focus on adding more data; this work shows that without diversifying prediction targets and applying gradient-projection methods to suppress task-specific shortcuts, learned representations fail to transfer.
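The takeaway mentions gradient projection only in passing. As a generic illustration of the idea (not the paper's actual method), a shortcut-suppressing update can remove the gradient component along a known shortcut direction; the function name and `shortcut_dir` below are hypothetical:

```python
import numpy as np

def project_out(grad: np.ndarray, shortcut_dir: np.ndarray) -> np.ndarray:
    """Orthogonal projection: remove the component of `grad` that
    lies along `shortcut_dir`, so the update can no longer move the
    parameters along that (assumed) shortcut axis."""
    s = shortcut_dir / np.linalg.norm(shortcut_dir)
    return grad - np.dot(grad, s) * s

# Toy usage: a gradient with a component along the shortcut axis.
g = np.array([3.0, 4.0])
s = np.array([1.0, 0.0])      # assumed shortcut direction
g_clean = project_out(g, s)   # -> array([0., 4.])
```

After projection the update is exactly orthogonal to the shortcut direction, which is the basic mechanism any gradient-projection scheme of this kind relies on.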

From the abstract

Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a modular architecture combining frozen pretrained tabular encoders with a lightweight message-passing
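The abstract's description of K-Space (frozen pretrained tabular encoders combined with lightweight message passing) can be sketched roughly as follows. Everything here is an assumption for illustration, not the paper's design: the function names, the fixed random projection standing in for a pretrained encoder, and the mean-aggregation rule over hypothetical foreign-key edges.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_encoder(rows: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Stand-in for a frozen pretrained tabular encoder: a fixed
    (never-updated) projection of raw row features."""
    return np.tanh(rows @ W)

def message_pass(h: np.ndarray, edges: list) -> np.ndarray:
    """One lightweight message-passing round: each row averages its
    own embedding with those of its linked rows."""
    out = h.copy()
    for i in range(h.shape[0]):
        nbrs = [j for (a, j) in edges if a == i]
        if nbrs:
            out[i] = 0.5 * h[i] + 0.5 * h[nbrs].mean(axis=0)
    return out

rows = rng.normal(size=(4, 8))       # 4 rows, 8 raw features
W = rng.normal(size=(8, 16))         # frozen encoder weights
edges = [(0, 1), (1, 0), (2, 3)]     # hypothetical foreign-key links
z = message_pass(frozen_encoder(rows, W), edges)
print(z.shape)                       # (4, 16)
```

The modularity matters here: because the encoder stays frozen, only the lightweight message-passing layer adapts to each database, which is the part the abstract identifies as vulnerable to task-specific shortcuts.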