Publishers have a new trick: they can embed invisible 'traps' in their pages that expose AI systems to legal liability for processing them.
March 27, 2026
Original Paper
Compound Statutory Liability Entrapment in Inference-Time AI Pipelines
SSRN · 6432898
The Takeaway
While most legal battles against AI focus on copyright infringement, this paper identifies a technical 'trap': the automated cleaning of HTML, a necessary step for AI training, forces the pipeline to intentionally remove licensing metadata. This structural 'entanglement' transforms a simple web-crawling task into a violation of the DMCA (17 U.S.C. § 1202).
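To make the mechanism concrete, here is a minimal sketch of how a naive HTML cleaner discards licensing metadata as a side effect of extracting text. It is an illustration under stated assumptions, not the paper's method: the sample page, the choice of `<meta name="copyright">` and `<link rel="license">` as the metadata carriers, and the helper names `clean` and `license_metadata` are all hypothetical. Whether such removal actually creates liability is a legal question the code does not address.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the tags and values are illustrative only.
SAMPLE_HTML = """<html><head>
<meta name="copyright" content="(c) 2026 Example Publisher">
<link rel="license" href="https://example.com/license">
</head><body><p>Article text.</p></body></html>"""


class TextOnlyCleaner(HTMLParser):
    """Naive cleaner: keeps text nodes and drops every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


class LicenseMetadataCollector(HTMLParser):
    """Records values from <meta name="copyright"> and <link rel="license">."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Assumption: exact-match attribute values; real pages vary widely.
        if tag == "meta" and attr.get("name") == "copyright":
            self.found.append(attr.get("content"))
        elif tag == "link" and attr.get("rel") == "license":
            self.found.append(attr.get("href"))


def clean(html):
    """Return only the visible text, as a typical cleaning step would."""
    parser = TextOnlyCleaner()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def license_metadata(html):
    """Return the licensing metadata present in the raw HTML."""
    parser = LicenseMetadataCollector()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    cleaned = clean(SAMPLE_HTML)
    metadata = license_metadata(SAMPLE_HTML)
    print("cleaned text:", cleaned)
    print("metadata in raw HTML:", metadata)
    # None of the licensing values survive the cleaning step.
    print("survives cleaning:", [m for m in metadata if m in cleaned])
```

The cleaner never targets the metadata specifically; it is lost because the licensing information lives in tags and attributes rather than in text nodes. A pipeline that wanted to preserve it would need a separate pass, such as the collector above, run before the text is extracted.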
From the abstract
As Artificial Intelligence shifts from static training ingestion to real-time, inference-based retrieval, the economic harm to primary web publishers has accelerated. Current legal frameworks focus heavily on 17 U.S.C. § 501 (infringement of the exclusive rights granted under § 106), which is frequently obfuscated by "Fair Use" defenses. That debate is already old. The industry has moved on. This paper introduces a novel enforcement paradigm utilizing 17 U.S.C. § 1202 (Integrity of Copyright Mana