Replaces traditional fixed-update rules in online learning with a causal Transformer to track switching experts in non-stationary environments.
March 31, 2026
Original Paper
Policy-Controlled Generalized Share: A General Framework with a Transformer Instantiation for Strictly Online Switching-Oracle Tracking
arXiv · 2603.28198
The Takeaway
The paper argues that the classic switching-oracle problem can be solved more effectively by keeping the share recursion fixed and treating the controls of the weight update as a learnable policy driven by a causal Transformer. This hybrid approach significantly outperforms traditional dynamic-programming methods in complex, non-stationary data streams.
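To make the idea of a fixed recursion with policy-chosen controls concrete, here is a minimal sketch. It assumes the classic Fixed Share update (an exponential loss update followed by mixing toward uniform) as a stand-in for the paper's generalized-share recursion, which this summary does not spell out. The names `share_step`, `run`, and `constant_controller` are hypothetical, and the controller is a constant placeholder where PCGS-TF would use a causal Transformer.

```python
import math

def share_step(weights, losses, eta, alpha):
    """One Fixed-Share-style round (assumed stand-in, not the paper's exact recursion)."""
    n = len(weights)
    # Loss update: down-weight each expert by exp(-eta * loss).
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    posterior = [s / total for s in scaled]
    # Share update: redistribute a fraction alpha of the mass uniformly,
    # so a previously poor expert can recover quickly after a switch.
    return [(1.0 - alpha) * p + alpha / n for p in posterior]

def run(loss_rounds, controller):
    """Run the recursion; the controller picks (eta, alpha) after each round's losses."""
    n = len(loss_rounds[0])
    weights = [1.0 / n] * n
    history = []
    trajectory = []
    for losses in loss_rounds:
        trajectory.append(weights)  # weights in force before this round's losses arrive
        history.append(losses)
        # Strictly online: the controller sees only losses revealed so far.
        eta, alpha = controller(history)
        weights = share_step(weights, losses, eta, alpha)
    return trajectory, weights

def constant_controller(history):
    """Hypothetical placeholder policy: fixed learning rate and share rate."""
    return 1.0, 0.1

# Two experts; expert 0 is best for five rounds, then expert 1 takes over.
rounds = [[0.0, 1.0]] * 5 + [[1.0, 0.0]] * 5
trajectory, final = run(rounds, constant_controller)
```

With the constant controller this reduces to ordinary Fixed Share. Swapping in a learned `controller` that maps the loss history to `(eta, alpha)` changes the adaptivity without touching `share_step`, which is the separation the takeaway describes.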
From the abstract
Static regret to a single expert is often the wrong target for strictly online prediction under non-stationarity, where the best expert may switch repeatedly over time. We study Policy-Controlled Generalized Share (PCGS), a general strictly online framework in which the generalized-share recursion is fixed while the post-loss update controls are allowed to vary adaptively. Its principal instantiation in this paper is PCGS-TF, which uses a causal Transformer as an update controller: after round t