Aligns a base model to a target model's behavior by optimizing the 'data mixture' weights instead of using RLHF or DPO.
March 18, 2026
Original Paper
Domain Mixture Design via Log-Likelihood Differences for Aligning Language Models with a Target Model
arXiv · 2603.16622
The Takeaway
Rather than fine-tuning on specific outputs, this method treats models as points in log-likelihood space, computes the direction from the base model toward a target model (like GPT-4), and re-weights the pretraining data mixture so that training moves the base model along that direction. It suggests that 'alignment' can be built into the data recipe itself rather than added as a post-training patch.
From the abstract
Instead of directly distilling a language model, this study addresses the problem of aligning a base model with a target model in distribution by designing the domain mixture of training data for pretraining or continued pretraining as a fixed training recipe. We propose a method for determining domain weights by viewing models as points in log-likelihood space and aligning the training update direction with the direction toward the target model. Experiments with NanoGPT show that the proposed method…
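The paper's exact weighting rule isn't given in this excerpt, but the core idea can be sketched: measure each domain's log-likelihood under the base and target models, take the gap as the direction toward the target, and turn those gaps into mixture weights. The function name `domain_weights`, the softmax weighting, and the example numbers below are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def domain_weights(base_ll, target_ll, temperature=1.0):
    """Hypothetical sketch: map per-domain log-likelihood gaps to mixture weights.

    base_ll, target_ll: average per-token log-likelihoods of held-out text
    from each domain under the base and target models, respectively.
    """
    base_ll = np.asarray(base_ll, dtype=float)
    target_ll = np.asarray(target_ll, dtype=float)
    # Direction toward the target in log-likelihood space: positive entries
    # mark domains where the target assigns more likelihood than the base.
    delta = target_ll - base_ll
    # A softmax (an assumption here) turns the gaps into a valid mixture:
    # non-negative weights that sum to 1, upweighting larger gaps.
    z = delta / temperature
    z -= z.max()  # subtract max for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Example: three hypothetical domains (e.g., code, web, dialogue) with
# made-up average per-token log-likelihoods under the two models.
base = [-2.1, -1.8, -2.5]
target = [-1.9, -1.8, -2.0]
print(domain_weights(base, target).round(3))
```

The domain with the largest base-to-target gap (the third one) receives the largest share of the mixture, so continued pretraining spends more tokens where the base model diverges most from the target.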