AI & ML Efficiency Breakthrough

FineRMoE extends MoE granularity to both intermediate and output dimensions, achieving a 136x increase in decoding throughput.

March 17, 2026

Original Paper

FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach

Ning Liao, Xiaoxing Wang, Xiaohan Qin, Junchi Yan

arXiv · 2603.13364

The Takeaway

FineRMoE overcomes the performance plateau of traditional fine-grained MoEs by expanding expert granularity along both the intermediate and output dimensions. The proposed upcycling method lets researchers convert existing dense models into this higher-efficiency architecture without training from scratch.
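To make the idea concrete, here is a minimal NumPy sketch of what two-dimensional expert slicing could look like. It is our own illustration, not the paper's implementation: it assumes a ReLU FFN with weights `W1`/`W2`, slices the intermediate dimension into experts (classic fine-grained upcycling), then additionally slices the output dimension FineRMoE-style, and checks that the sliced experts still reproduce the dense FFN when all are active.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 8, 16
g_inter, g_out = 4, 2  # illustrative granularity along intermediate / output dims

# Dense FFN weights to be "upcycled" (biases omitted for brevity).
W1 = rng.standard_normal((d_model, d_ff))   # up-projection
W2 = rng.standard_normal((d_ff, d_model))   # down-projection
relu = lambda z: np.maximum(z, 0.0)

x = rng.standard_normal(d_model)
dense_out = relu(x @ W1) @ W2

# Classic fine-grained upcycling: slice the intermediate dimension into
# g_inter experts. Summing their outputs reproduces the dense FFN exactly,
# because ReLU acts elementwise on the hidden units.
step = d_ff // g_inter
experts = [(W1[:, i*step:(i+1)*step], W2[i*step:(i+1)*step, :])
           for i in range(g_inter)]
moe_out = sum(relu(x @ w1) @ w2 for w1, w2 in experts)
assert np.allclose(dense_out, moe_out)

# FineRMoE-style extension (as we read the abstract): additionally slice the
# *output* dimension, giving a g_inter x g_out grid of finer experts, each
# writing only a chunk of the output vector.
out_step = d_model // g_out
fine_out = np.zeros(d_model)
for w1, w2 in experts:
    h = relu(x @ w1)
    for j in range(g_out):
        cols = slice(j * out_step, (j + 1) * out_step)
        fine_out[cols] += h @ w2[:, cols]
assert np.allclose(dense_out, fine_out)
```

With all experts active the decomposition is lossless; the efficiency gains come from routing tokens to only a few of the finer experts, which this sketch does not model.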

From the abstract

As revealed by the scaling law of fine-grained MoE, model performance ceases to be improved once the granularity of the intermediate dimension exceeds the optimal threshold, limiting further gains from single-dimension fine-grained design. To address this bottleneck, we propose FineRMoE (FineR-Grained MoE), an architecture that extends fine-grained expert design to both intermediate and output dimensions, aiming to enhance expert specialization beyond the single-dimension limit. We further intro…