Generates novel, structurally plausible protein sequences from small alignments using a training-free stochastic attention mechanism on a standard laptop.
March 17, 2026
Original Paper
Training-Free Generation of Protein Sequences from Small Family Alignments via Stochastic Attention
arXiv · 2603.14717
The Takeaway
This approach democratizes protein design for the thousands of protein families with fewer than 100 known members. It bypasses the need for massive pretraining datasets or GPUs by treating the modern Hopfield energy over an alignment as a Boltzmann distribution and sampling from it directly.
From the abstract
Most protein families have fewer than 100 known members, a regime where deep generative models overfit or collapse. We propose stochastic attention (SA), a training-free sampler that treats the modern Hopfield energy over a protein alignment as a Boltzmann distribution and draws samples via Langevin dynamics. The score function is a closed-form softmax attention operation requiring no training, no pretraining data, and no GPU, with cost linear in alignment size. Across eight Pfam families, SA ge
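The sampler described in the abstract can be sketched concretely. For the modern (continuous) Hopfield energy over stored patterns, the negative energy gradient is exactly a softmax attention readout minus the current state, so Langevin dynamics needs no learned network. The sketch below is a minimal illustration under assumptions not specified in the excerpt: sequences are assumed to be encoded as continuous vectors (e.g., flattened one-hot columns of the alignment), and the step size, temperature, and decoding back to amino acids are placeholders, not the paper's settings.

```python
import numpy as np

def stochastic_attention_sample(patterns, beta=1.0, step=0.01, n_steps=500, rng=None):
    """Draw one sample via Langevin dynamics on the modern Hopfield energy.

    patterns : (N, d) array of continuous embeddings of the N aligned
               sequences (hypothetical encoding; the paper's encoding
               is not given in the excerpt).
    beta     : inverse temperature of the Boltzmann distribution.
    """
    rng = np.random.default_rng(rng)
    n, d = patterns.shape
    x = rng.standard_normal(d)  # random initial state
    for _ in range(n_steps):
        # Closed-form score of the Boltzmann distribution:
        # -grad E(x) = patterns^T softmax(beta * patterns x) - x,
        # i.e., a single softmax attention over the alignment. Cost is
        # linear in the number of stored sequences N.
        logits = beta * (patterns @ x)
        logits -= logits.max()              # numerical stability
        attn = np.exp(logits)
        attn /= attn.sum()
        score = patterns.T @ attn - x
        # Unadjusted Langevin update: drift along the score plus noise.
        x = x + step * score + np.sqrt(2.0 * step / beta) * rng.standard_normal(d)
    return x
```

At low temperature (large `beta`) the dynamics concentrates near stored patterns, recovering Hopfield retrieval; at moderate temperature it explores plausible interpolations of the family, which is the regime the paper exploits for generation.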