AI & ML Efficiency Breakthrough

Introduces PolarQuant, a quantization method that uses Hadamard rotation to quantize LLM weights near-losslessly at 5 bits, with no calibration data required.

April 1, 2026

Original Paper

PolarQuant: Optimal Gaussian Weight Quantization via Hadamard Rotation for LLM Compression

Caio Vicentino

arXiv · 2603.29078

The Takeaway

The ablation shows that Hadamard rotation accounts for nearly all of the quality gains in Gaussian quantization, enabling high-throughput, low-memory inference (43 tok/s) with minimal perplexity degradation. The result is a highly practical drop-in technique for model compression.

From the abstract

We present PolarQuant, a post-training weight quantization method for large language models (LLMs) that exploits the distributional structure of neural network weights to achieve near-lossless compression. PolarQuant operates in three stages: (1) block-wise normalization to the unit hypersphere, (2) Walsh-Hadamard rotation to transform coordinates into approximately Gaussian random variables, and (3) quantization with centroids matched to the Gaussian distribution. Our ablation reveals that Hada
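The three stages described in the abstract can be sketched on a single weight block. This is a minimal illustration, not the paper's implementation: the function names are ours, and the Gaussian-matched codebook below is a simple Monte Carlo approximation (means of equiprobable slices of a standard normal) rather than the paper's centroids.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction of an orthonormal n x n Walsh-Hadamard
    # matrix (n must be a power of two). H is symmetric, so H.T == H.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(H.shape[0])

def gaussian_centroids(bits, n_samples=1_000_000, seed=0):
    # Illustrative Gaussian-matched codebook: means of 2**bits
    # equiprobable slices of a standard normal, estimated by sampling.
    rng = np.random.default_rng(seed)
    s = np.sort(rng.standard_normal(n_samples))
    return np.array([chunk.mean() for chunk in np.array_split(s, 2 ** bits)])

def polar_quant_block(w, bits=5):
    n = len(w)
    # (1) block-wise normalization onto the unit hypersphere (keep the scale)
    scale = np.linalg.norm(w)
    u = w / scale
    # (2) Hadamard rotation: coordinates become approximately Gaussian;
    # a unit vector's coordinates scale like 1/sqrt(n), so standardize.
    z = hadamard(n) @ u
    z_std = z * np.sqrt(n)
    # (3) snap each coordinate to the nearest Gaussian-matched centroid
    C = gaussian_centroids(bits)
    idx = np.abs(z_std[:, None] - C[None, :]).argmin(axis=1)
    return idx, scale, C

def dequant_block(idx, scale, C, n):
    # Invert the pipeline: codebook lookup, undo standardization,
    # rotate back (H is orthonormal and symmetric), restore the scale.
    z_std = C[idx]
    return scale * (hadamard(n) @ (z_std / np.sqrt(n)))
```

At 5 bits the 32-entry codebook keeps the round-trip reconstruction error small without any calibration data, since the only statistics used are those of a standard normal.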