Contrastive Association Learning (CAL) successfully recovers functional gene associations from expression data where standard similarity metrics fail.
March 24, 2026
Original Paper
Beyond Expression Similarity: Contrastive Learning Recovers Functional Gene Associations from Protein Interaction Structure
arXiv · 2603.20955
The Takeaway
CAL recovers protein-protein interactions and functional links from gene perturbation data with an AUC of 0.9. The result suggests that co-occurrence training from NLP transfers effectively to biological systems, where it outperforms traditional expression-similarity methods.
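To make the reported metric concrete, here is a minimal sketch of how an AUC over scored gene pairs can be computed. It assumes that "AUC" means area under the ROC curve, which equals the probability that a randomly chosen true association outscores a randomly chosen non-association. The function name `pair_auc`, the scores, and the labels are all hypothetical illustrations, not the paper's data or evaluation code.

```python
def pair_auc(scores, labels):
    """ROC AUC for scored pairs: the fraction of (positive, negative)
    combinations in which the positive pair gets the higher score.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one model score per candidate gene pair,
# with label 1 for a known association and 0 otherwise.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc = pair_auc(scores, labels)
print(round(auc, 3))  # 8 of 9 positive/negative comparisons are won -> 0.889
```

Under this reading, an AUC of 0.9 means a true association is ranked above a non-association about 90% of the time, while 0.5 corresponds to random ranking.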
From the abstract
The Predictive Associative Memory (PAM) framework posits that useful relationships often connect items that co-occur in shared contexts rather than items that appear similar in embedding space. A contrastive MLP trained on co-occurrence annotations--Contrastive Association Learning (CAL)--has improved multi-hop passage retrieval and discovered narrative function at corpus scale in text. We test whether this principle transfers to molecular biology, where protein-protein interactions provide func