IF4 introduces an adaptive 4-bit data type that switches between Float and Integer representations to minimize quantization error.
March 31, 2026
Original Paper
Adaptive Block-Scaled Data Types
arXiv · 2603.28765
The Takeaway
By repurposing an unused bit in the scale factor to select the better of a Float or Integer representation for each block of values, IF4 significantly outperforms NVFP4. This provides a clear path for next-generation hardware to support higher-fidelity 4-bit quantization for LLMs.
From the abstract
NVFP4 has grown increasingly popular as a 4-bit format for quantizing large language models due to its hardware support and its ability to retain useful information with relatively few bits per parameter. However, the format is not without limitations: recent work has shown that NVFP4 suffers from its error distribution, resulting in large amounts of quantization error on near-maximal values in each group of 16 values. In this work, we leverage this insight to design new Adaptive Block-Scaled Data Types […]
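The core idea can be sketched in a few lines. The snippet below is an illustrative approximation, not the paper's implementation: it quantizes each 16-value block twice, once against the FP4 (E2M1) value grid used by NVFP4 and once against a symmetric INT4 grid, then keeps whichever representation gives lower mean-squared error. The grids, the per-block max-abs scaling, and the `adaptive_quantize` helper are assumptions for illustration; the actual IF4 encoding, scale-factor format, and selection rule are defined in the paper.

```python
import numpy as np

# FP4 (E2M1) representable magnitudes, mirrored for sign; max is 6.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_VALUES = np.concatenate([-FP4_GRID[::-1], FP4_GRID])
# Symmetric signed INT4 grid (illustrative choice; drops -8 for symmetry).
INT4_VALUES = np.arange(-7, 8, dtype=float)

def quantize_block(block, grid):
    """Scale the block so its max magnitude maps to the grid max,
    then round each value to the nearest grid point (dequantized)."""
    amax = np.abs(block).max()
    if amax == 0:
        return np.zeros_like(block)
    scale = amax / grid.max()
    idx = np.abs(block[:, None] / scale - grid[None, :]).argmin(axis=1)
    return grid[idx] * scale

def adaptive_quantize(block):
    """Try both representations for one block; keep the lower-MSE one.
    Returns (chosen_name, dequantized_block)."""
    candidates = {
        "fp4": quantize_block(block, FP4_VALUES),
        "int4": quantize_block(block, INT4_VALUES),
    }
    name = min(candidates, key=lambda k: np.mean((candidates[k] - block) ** 2))
    return name, candidates[name]
```

In hardware, the one-bit choice would ride along in the otherwise-unused bit of the block's scale factor, so decoding a block costs only a table select. Blocks whose values cluster near the maximum tend to favor the evenly spaced INT4 grid, while heavy-tailed blocks tend to favor FP4's finer resolution near zero.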