IF4 introduces an adaptive 4-bit data type that switches between Float and Integer representations to minimize quantization error.
March 31, 2026
Original Paper
Adaptive Block-Scaled Data Types
arXiv · 2603.28765
The Takeaway
By repurposing an unused bit in the scale factor to select the better of a Float or Integer representation for each block of values, IF4 significantly outperforms NVFP4. This provides a clear path for next-generation hardware to support higher-fidelity 4-bit quantization for LLMs.
From the abstract
NVFP4 has grown increasingly popular as a 4-bit format for quantizing large language models due to its hardware support and its ability to retain useful information with relatively few bits per parameter. However, the format is not without limitations: recent work has shown that NVFP4 suffers from its error distribution, resulting in large amounts of quantization error on near-maximal values in each group of 16 values. In this work, we leverage this insight to design new Adaptive Block-Scaled Data Types […]
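The core idea can be sketched in a few lines. The snippet below is an illustrative approximation, not the paper's implementation: it quantizes each 16-value block twice, once against the FP4 (E2M1) value grid used by NVFP4 and once against a symmetric INT4 grid, then keeps whichever representation gives lower mean-squared error. The grids, the per-block max-abs scaling, and the `adaptive_quantize` helper are assumptions for illustration; the actual IF4 encoding, scale-factor format, and selection rule are defined in the paper.

```python
import numpy as np

# FP4 (E2M1) representable magnitudes, mirrored for sign; max is 6.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_VALUES = np.concatenate([-FP4_GRID[::-1], FP4_GRID])
# Symmetric signed INT4 grid (illustrative choice; drops -8 for symmetry).
INT4_VALUES = np.arange(-7, 8, dtype=float)

def quantize_block(block, grid):
    """Scale the block so its max magnitude maps to the grid max,
    then round each value to the nearest grid point (dequantized)."""
    amax = np.abs(block).max()
    if amax == 0:
        return np.zeros_like(block)
    scale = amax / grid.max()
    idx = np.abs(block[:, None] / scale - grid[None, :]).argmin(axis=1)
    return grid[idx] * scale

def adaptive_quantize(block):
    """Try both representations for one block; keep the lower-MSE one.
    Returns (chosen_name, dequantized_block)."""
    candidates = {
        "fp4": quantize_block(block, FP4_VALUES),
        "int4": quantize_block(block, INT4_VALUES),
    }
    name = min(candidates, key=lambda k: np.mean((candidates[k] - block) ** 2))
    return name, candidates[name]
```

In hardware, the one-bit choice would ride along in the otherwise-unused bit of the block's scale factor, so decoding a block costs only a table select. Blocks whose values cluster near the maximum tend to favor the evenly spaced INT4 grid, while heavy-tailed blocks tend to favor FP4's finer resolution near zero.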