GIDE enables precise, training-free image editing for discrete Diffusion LLMs by introducing a novel Discrete Noise Inversion mechanism.
March 24, 2026
Original Paper
GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing
arXiv · 2603.21176
The Takeaway
Previous editing techniques struggled with the discrete tokenization used by multimodal Diffusion LLMs. GIDE lets users perform high-fidelity edits, guided by text prompts, points, or boxes, while strictly preserving the background, and it significantly outperforms prior training-free methods.
From the abstract
While Diffusion Large Language Models (DLLMs) have demonstrated remarkable capabilities in multi-modal generation, performing precise, training-free image editing remains an open challenge. Unlike continuous diffusion models, the discrete tokenization inherent in DLLMs hinders the application of standard noise inversion techniques, often leading to structural degradation during editing. In this paper, we introduce GIDE (Grounded Inversion for DLLM Image Editing), a unified framework designed to …
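The excerpt names the core obstacle, that discrete tokens break continuous noise inversion, without detailing GIDE's fix. As a rough intuition only, here is a minimal sketch of what noise inversion can mean for an absorbing-state (masked) discrete diffusion process: record which token positions get masked at each forward step, then replay that record in reverse, restoring original tokens outside the edit region. Everything below is an illustrative assumption rather than GIDE's actual method: `MASK_ID`, the linear masking schedule, and the `invert`/`edit`/`denoise_fn` interface are invented for the sketch.

```python
import numpy as np

MASK_ID = -1  # hypothetical id for the absorbing "mask" token (assumption)

def invert(tokens: np.ndarray, num_steps: int, rng: np.random.Generator):
    """Forward-mask `tokens` over `num_steps`, recording which positions are
    newly masked at each step. The recorded schedule plays the role of the
    'discrete noise' and can later be replayed deterministically."""
    x = tokens.copy()
    schedule = []
    for t in range(1, num_steps + 1):
        # Linear schedule (assumed): by step t, a fraction t/num_steps is masked.
        target = round(len(x) * t / num_steps)
        visible = np.flatnonzero(x != MASK_ID)
        n_new = target - (len(x) - len(visible))
        newly = rng.choice(visible, size=max(n_new, 0), replace=False)
        x[newly] = MASK_ID
        schedule.append(newly)
    return x, schedule  # x is fully masked after the last step

def edit(tokens: np.ndarray, schedule, edit_region: set, denoise_fn):
    """Reverse the forward process by replaying `schedule` backwards.
    Positions outside `edit_region` are restored to their original tokens
    (background preservation); positions inside are re-predicted."""
    x = np.full_like(tokens, MASK_ID)
    for newly in reversed(schedule):
        preds = denoise_fn(x)  # model's per-position token predictions
        for i in newly:
            x[i] = preds[i] if int(i) in edit_region else tokens[i]
    return x

# Toy usage: with an empty edit region, replaying the schedule must
# reconstruct the input exactly, regardless of the model's predictions.
rng = np.random.default_rng(0)
tokens = np.arange(16)                    # stand-in "image token" sequence
noisy, sched = invert(tokens, num_steps=4, rng=rng)
identity = edit(tokens, sched, edit_region=set(), denoise_fn=lambda x: x)
assert np.array_equal(identity, tokens)
```

In this toy setting, replaying the recorded mask pattern is what gives strict background preservation: a background position is restored to its original token at exactly the step it would have been revealed, while positions inside the edit region are left to the model's predictions.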