Top-b sampling introduces entropy-aware adaptive bandwidth for LLM decoding, effectively approximating a self-regulating control system for generation.
March 17, 2026
Original Paper
Top-b: Entropic Regulation of Relative Probability Bands in Autoregressive Language Processes
arXiv · 2603.14567
The Takeaway
Top-b replaces static truncation (Top-k/Top-p) with a dynamic bandwidth that tightens during low-entropy logical reasoning and widens during high-entropy creative generation. This markedly reduces inter-decoding variance and 'tail noise' in autoregressive generation.
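The mechanism can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `b_min`/`b_max` band limits and the linear entropy-to-bandwidth scaling are assumptions for illustration, and the "relative probability band" is interpreted as keeping tokens whose probability lies within a factor of the top token's probability.

```python
import numpy as np

def top_b_filter(logits, b_min=0.02, b_max=0.3, rng=None):
    """Hypothetical sketch of entropy-adaptive relative-band truncation.

    Keeps tokens whose probability falls within a relative band of the
    top probability; the band threshold is scaled by the normalized
    entropy of the distribution (more permissive when entropy is high,
    so more of the tail survives in creative, high-entropy contexts).
    """
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Normalized Shannon entropy in [0, 1].
    H = -(probs * np.log(probs + 1e-12)).sum() / np.log(len(probs))
    # Assumed linear schedule: band threshold grows with entropy.
    b = b_min + (b_max - b_min) * H
    # Relative probability band: keep tokens within factor b of the max.
    keep = probs >= b * probs.max()
    filtered = np.where(keep, probs, 0.0)
    filtered /= filtered.sum()
    rng = rng or np.random.default_rng()
    return int(rng.choice(len(probs), p=filtered)), b

# A sharply peaked (low-entropy) distribution yields a tight band, so
# only the dominant token survives; a uniform (high-entropy) one keeps
# the full tail.
token, band = top_b_filter(np.array([10.0, 0.0, 0.0, 0.0]))
```

Note the inversion relative to a fixed Top-p cutoff: because the band is *relative* to the maximum probability, a peaked distribution prunes aggressively even with a small `b`, while a flat distribution retains candidates that static truncation would discard.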
From the abstract
Probabilistic language generators are theoretically modeled as discrete stochastic processes, yet standard decoding strategies (Top-k, Top-p) impose static truncation rules that fail to accommodate the dynamic information density of natural language. This misalignment often forces a suboptimal trade-off: static bounds are either too restrictive for high-entropy creative generation or too permissive for low-entropy logical reasoning. In this work, we formalize the generation process as a trajecto