AI & ML · Breaks Assumption

Distilled VAE encoders are found to perform significantly better at higher, unseen resolutions than at their native training resolution.

arXiv · March 17, 2026 · 2603.14536

Jiaming Chu, Tao Wang, Lei Jin

The Takeaway

This challenges the conventional wisdom that performance degrades at out-of-distribution high resolutions. Practitioners can exploit 'resolution remapping', upsampling inputs before encoding, to gain substantial quality improvements in generative models, even when the encoder was never trained at those resolutions.
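
A minimal sketch of what resolution remapping might look like in practice, assuming a diffusers-style AutoencoderKL as a stand-in for the distilled encoder; the model checkpoint, scale factor, and API shape are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of "resolution remapping": upsample inputs past the
# encoder's native training resolution before encoding. The checkpoint,
# scale factor, and interpolation mode are assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")  # placeholder model
vae.eval()

@torch.no_grad()
def encode_with_remapping(images: torch.Tensor, scale: float = 2.0) -> torch.Tensor:
    """Encode images after upsampling beyond the encoder's training resolution.

    images: (B, 3, H, W) tensor in [-1, 1], where (H, W) is the encoder's
    native training resolution.
    """
    # Map inputs to a higher, unseen resolution before encoding.
    upsampled = F.interpolate(
        images, scale_factor=scale, mode="bicubic", align_corners=False
    )
    # Encode at the remapped resolution; per the paper's finding, this can
    # outperform encoding at the native training resolution.
    return vae.encode(upsampled).latent_dist.sample()
```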

From the abstract

Variational Autoencoder (VAE) encoders play a critical role in modern generative models, yet their computational cost often motivates the use of knowledge distillation or quantization to obtain compact alternatives. Existing studies typically assume that models work better on samples close to their training data distribution than on unseen distributions. In this work, we report a counter-intuitive phenomenon in VAE encoder distillation: a compact encoder distilled only at low resolution …
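
For context, a generic sketch of what distilling a compact VAE encoder at a single low resolution can look like; the latent-matching MSE objective, the fixed 256×256 batch size, and the function names here are common-practice assumptions, not necessarily the paper's recipe.

```python
# Generic latent-matching distillation step for a VAE encoder. The plain MSE
# loss and the module/optimizer interfaces are illustrative assumptions.
import torch
import torch.nn as nn

def distill_step(teacher_encoder: nn.Module,
                 student_encoder: nn.Module,
                 batch: torch.Tensor,
                 optimizer: torch.optim.Optimizer) -> float:
    """One distillation step on a low-resolution batch, e.g. (B, 3, 256, 256)."""
    with torch.no_grad():
        target = teacher_encoder(batch)   # frozen teacher latents
    pred = student_encoder(batch)         # compact student latents
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The counter-intuitive result is that a student trained this way at one low resolution can, when fed upsampled inputs, reconstruct better than at the resolution it was actually distilled on.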