AI & ML Paradigm Shift

This paper introduces a graph tokenization framework that allows standard Transformers like BERT to beat specialized Graph Neural Networks without any architectural changes.

arXiv · March 13, 2026 · 2603.11099

Zeyuan Guo, Enmao Diao, Cheng Yang, Chuan Shi

Why it matters

By combining reversible graph serialization with BPE, the authors treat graphs as sequences of substructure tokens. This enables the direct application of the entire LLM ecosystem (pre-training, scaling, and fine-tuning) to graph data, achieving SOTA results on 14 benchmarks and bridging the gap between GNNs and Transformers.
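To make the idea concrete, here is a minimal sketch (not the authors' code, and with invented token names) of what "reversible graph serialization" can mean: a graph is flattened into a token sequence from which the original edge set can be reconstructed exactly, so no structural information is lost before tokenization.

```python
# Illustrative sketch of reversible graph serialization.
# A graph (edge set) becomes a flat token sequence; deserialize()
# inverts serialize(), so the round trip is lossless.

def serialize(edges):
    """Turn an edge set into a token sequence. Sorting makes the
    output canonical, so equal graphs map to equal sequences."""
    tokens = []
    for u, v in sorted(edges):
        tokens += [f"n{u}", f"n{v}", "|"]  # "|" separates edges
    return tokens

def deserialize(tokens):
    """Invert serialize(): recover the edge set from the tokens."""
    edges, buf = set(), []
    for t in tokens:
        if t == "|":
            edges.add((int(buf[0][1:]), int(buf[1][1:])))
            buf = []
        else:
            buf.append(t)
    return edges

triangle = {(0, 1), (1, 2), (0, 2)}
assert deserialize(serialize(triangle)) == triangle  # round trip holds
```

Because the mapping is invertible, a Transformer trained on these sequences sees the full graph, not a lossy summary of it.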

From the abstract

The success of large pretrained Transformers is closely tied to tokenizers, which convert raw input into discrete symbols. Extending these models to graph-structured data remains a significant challenge. In this work, we introduce a graph tokenization framework that generates sequential representations of graphs by combining reversible graph serialization, which preserves graph information, with Byte Pair Encoding (BPE), a widely adopted tokenizer in large language models (LLMs). …
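The BPE step the abstract refers to works the same way on serialized graphs as on text: repeatedly fuse the most frequent adjacent token pair into a single new token, so recurring substructures become single vocabulary entries. A minimal sketch of one merge round, on hypothetical node tokens:

```python
# One round of BPE-style merging on a serialized graph sequence:
# the most frequent adjacent token pair is fused into one token.
from collections import Counter

def most_frequent_pair(seq):
    """Count adjacent token pairs and return the most common one."""
    return Counter(zip(seq, seq[1:])).most_common(1)[0][0]

def merge(seq, pair):
    """Replace every occurrence of `pair` with a single fused token."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])  # fuse the pair
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

# The edge (n0, n1) appears twice, so its tokens get fused first.
seq = ["n0", "n1", "|", "n0", "n1", "|", "n2", "n3", "|"]
seq = merge(seq, most_frequent_pair(seq))
print(seq)  # → ['n0n1', '|', 'n0n1', '|', 'n2', 'n3', '|']
```

Iterating this merge builds a vocabulary of frequent substructure tokens, which is what lets an off-the-shelf LLM pipeline consume graphs without architectural changes.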