A novel approach to upcycle multiple dense expert models into a unified Mixture-of-Experts model without any additional training.
April 1, 2026
Original Paper
Training-Free Dynamic Upcycling of Expert Language Models
arXiv · 2603.29765
The Takeaway
Leveraging a closed-form ridge regression solution, the method dynamically adds experts to a single multitask model while eliminating the cost of fine-tuning and the risk of catastrophic forgetting.
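The paper's exact merging formulation isn't reproduced here, but the closed-form ridge regression it leans on is the standard one: the weights solving a regularized least-squares problem can be written as w = (XᵀX + λI)⁻¹Xᵀy, with no iterative training required. A minimal NumPy sketch (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def ridge_closed_form(X, y, lam=1.0):
    """Solve ridge regression in closed form: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    # Solving the linear system is more stable than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy example: recover known weights from noiseless data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w
w = ridge_closed_form(X, y, lam=1e-6)  # small lam -> near least-squares fit
```

Because the solution is a single linear solve, fitting (or refitting when a new expert is added) is cheap and deterministic, which is what makes a training-free pipeline plausible.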
From the abstract
Large Language Models (LLMs) have achieved remarkable performance on a wide range of specialized tasks, exhibiting strong problem-solving capabilities. However, training these models is prohibitively expensive, and they often lack domain-specific expertise because they rely on general knowledge datasets. Expertise finetuning can address this issue; however, it often leads to overspecialization, and developing a single multi-domain expert remains difficult due to diverging objectives. […]