BadGraph demonstrates that LLMs can generate universal adversarial attacks on text-attributed graphs, exploiting vulnerabilities in both GNN- and PLM-based architectures.
March 24, 2026
Original Paper
Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
arXiv · 2603.21155
The Takeaway
The paper shows that structural and textual perturbations can be jointly optimized by LLMs to break graph-based learning systems. This highlights a significant security risk for text-attributed graphs in applications such as fraud detection and social network analysis.
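To make that attack surface concrete, here is a minimal sketch of a joint structural-and-textual perturbation on a toy text-attributed graph. The `TextAttributedGraph` class and the `llm_propose_edits` helper are hypothetical stand-ins invented for illustration; the paper's actual LLM-driven optimization procedure is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class TextAttributedGraph:
    """Toy TAG: nodes carry raw text, edges carry topology."""
    node_text: dict[int, str]
    edges: set[tuple[int, int]] = field(default_factory=set)


def llm_propose_edits(graph: TextAttributedGraph, target: int):
    """Hypothetical stand-in for the LLM step: propose a rewrite of the
    target node's text plus a small budget of edge flips."""
    rewrite = graph.node_text[target] + " (long-standing, verified account)"
    flips = [(target, v) for v in graph.node_text if v != target][:2]
    return rewrite, flips


def apply_joint_perturbation(graph: TextAttributedGraph, target: int) -> TextAttributedGraph:
    """Apply the textual and structural edits together, as one attack."""
    rewrite, flips = llm_propose_edits(graph, target)
    out = TextAttributedGraph(dict(graph.node_text), set(graph.edges))
    out.node_text[target] = rewrite                 # textual perturbation
    for u, v in flips:                              # structural perturbation:
        out.edges ^= {(min(u, v), max(u, v))}       # add edge if absent, drop if present
    return out


if __name__ == "__main__":
    g = TextAttributedGraph(
        node_text={0: "Great seller, fast shipping",
                   1: "New account, no history",
                   2: "Flagged for spam"},
        edges={(0, 1)},
    )
    print(apply_joint_perturbation(g, target=1))
```

The point of the sketch is only that a single attack edits both surfaces at once: the text a language model reads and the edges a GNN aggregates over.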
From the abstract
Text-attributed graphs (TAGs) enhance graph learning by integrating rich textual semantics and topological context for each node. While boosting expressiveness, they also expose new vulnerabilities in graph learning through text-based adversarial surfaces. Recent advances leverage diverse backbones, such as graph neural networks (GNNs) and pre-trained language models (PLMs), to capture both structural and textual information in TAGs. This diversity raises a key question: How can we design universal adversarial attacks…
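To see why a single perturbation can transfer across backbone families, the toy views below contrast what a PLM-style encoder consumes (a node's raw text) with what a GNN-style encoder consumes (its neighborhood). Both views are deliberately trivial stand-ins written for this blurb, not the paper's models or its attack.

```python
# Toy "views" of one node under two backbone families. A joint edit changes
# both, which is why attacks optimized over text and structure together can
# transfer across backbones. (Illustrative only; not the paper's method.)
node_text = {0: "great seller", 1: "new account", 2: "trusted vendor"}
edges = {(0, 1)}


def plm_view(node: int) -> list[str]:
    """What a PLM-style backbone reads: the node's token sequence."""
    return node_text[node].split()


def gnn_view(node: int) -> set[int]:
    """What a GNN-style backbone aggregates: the one-hop neighborhood."""
    return ({v for u, v in edges if u == node}
            | {u for u, v in edges if v == node})


before = (plm_view(1), gnn_view(1))
node_text[1] += " (verified)"   # textual perturbation hits the PLM surface
edges.add((1, 2))               # structural perturbation hits the GNN surface
print(before, "->", (plm_view(1), gnn_view(1)))
```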