Offline Decision Transformers can now synthesize strategies that surpass the classical heuristics they were trained on for the Traveling Salesman Problem.
March 27, 2026
Original Paper
Offline Decision Transformers for Neural Combinatorial Optimization: Surpassing Heuristics on the Traveling Salesman Problem
arXiv · 2603.25241
The Takeaway
The paper demonstrates that neural combinatorial optimization can leverage existing domain knowledge, in the form of heuristic solutions, to learn superior, generalized strategies. This moves neural solvers from imitating algorithms to improving on them through offline RL.
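To make the idea of learning from heuristic solutions concrete, here is a minimal sketch of how a heuristic tour might be turned into the (return-to-go, state, action) sequence a Decision Transformer is trained on. The nearest-neighbor heuristic, the per-step reward (negative edge length), the use of the current city as the state, and the function names are all illustrative assumptions; this summary does not specify the paper's actual encoding.

```python
import math


def nearest_neighbor_tour(cities, start=0):
    """Build a tour with the classical nearest-neighbor heuristic.

    Ties are broken by the lower city index so the result is deterministic.
    """
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda j: (math.dist(last, cities[j]), j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def returns_to_go(cities, tour):
    """Per-step rewards and returns-to-go for a closed tour.

    Assumed reward: the negative length of the edge travelled at each step,
    including the final edge back to the start city.
    """
    n = len(tour)
    rewards = [
        -math.dist(cities[tour[t]], cities[tour[(t + 1) % n]]) for t in range(n)
    ]
    rtg = [0.0] * n
    running = 0.0
    for t in reversed(range(n)):
        running += rewards[t]
        rtg[t] = running  # sum of rewards from step t to the end of the tour
    return rewards, rtg


def to_trajectory(cities, tour):
    """Emit (return_to_go, current_city, next_city) triples for offline training."""
    _, rtg = returns_to_go(cities, tour)
    n = len(tour)
    return [(rtg[t], tour[t], tour[(t + 1) % n]) for t in range(n)]


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    tour = nearest_neighbor_tour(square)
    for step in to_trajectory(square, tour):
        print(step)
```

In the general Decision Transformer setup, the model is conditioned on a target return at inference time. Requesting a return higher (a shorter tour) than those in the heuristic dataset is one plausible route by which such a model could go beyond imitation, though whether the paper does exactly this is not stated here.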
From the abstract
Combinatorial optimization problems like the Traveling Salesman Problem are critical in industry yet NP-hard. Neural Combinatorial Optimization has shown promise, but its reliance on online reinforcement learning (RL) hampers deployment and underutilizes decades of algorithmic knowledge. We address these limitations by applying the offline RL framework, Decision Transformer, to learn superior strategies directly from datasets of heuristic solutions; it aims not only to imitate but to synthesi