Simple Self-Distillation (SSD) improves LLM code generation (e.g., Qwen3-30B) by 13% Pass@1 without any external verifiers or teacher models.
April 2, 2026
Original Paper
Embarrassingly Simple Self-Distillation Improves Code Generation
arXiv · 2604.01193
The Takeaway
It shows that an LLM can significantly improve its own code generation simply by fine-tuning on its own high-temperature samples, suggesting that post-training data can be bootstrapped effectively without expensive RL or human labeling.
From the abstract
Can a large language model (LLM) improve at code generation using only its own raw outputs, without a verifier, a teacher model, or reinforcement learning? We answer in the affirmative with simple self-distillation (SSD): sample solutions from the model with certain temperature and truncation configurations, then fine-tune on those samples with standard supervised fine-tuning. SSD improves Qwen3-30B-Instruct from 42.4% to 55.3% pass@1 on LiveCodeBench v6, with gains concentrating on harder problems.
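The two-step recipe from the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `sample_solution` and `fine_tune` are hypothetical stand-ins for the model's sampling and SFT stages, and the prompt names are invented.

```python
# Sketch of simple self-distillation (SSD): sample raw solutions from the
# model itself, then run standard SFT on those samples. No verifier,
# teacher model, or reward signal filters the data.

def sample_solution(model, prompt, temperature=1.0):
    # Stand-in: the real step draws a code solution from the LLM using
    # the paper's temperature/truncation settings.
    return f"solution_for:{prompt}@T={temperature}"

def fine_tune(model, dataset):
    # Stand-in for ordinary supervised fine-tuning on
    # (prompt, self-generated solution) pairs.
    return {"base": model, "sft_examples": len(dataset)}

def simple_self_distillation(model, prompts, samples_per_prompt=4):
    # Step 1: collect the model's own high-temperature outputs, unfiltered.
    dataset = [
        (p, sample_solution(model, p))
        for p in prompts
        for _ in range(samples_per_prompt)
    ]
    # Step 2: standard SFT on the self-generated dataset.
    return fine_tune(model, dataset), dataset

student, data = simple_self_distillation(
    "qwen3-30b-instruct", ["two-sum", "lru-cache"]
)
```

The point the sketch makes is structural: every training example comes straight from the model's own sampling pass, so the entire pipeline is the kind of plain SFT loop most post-training stacks already have.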