Structured distillation for personalized agent memory achieves an 11x reduction in token count while preserving 96% of the retrieval quality of verbatim history.
arXiv · March 16, 2026 · 2603.13017
Why it matters
Solves the 'long context' cost problem for AI agents by distilling chat history into compact, structured objects; this allows thousands of previous exchanges to fit into a single prompt context window without sacrificing the agent's ability to recall specific details.
From the abstract
Long conversations with an AI agent create a simple problem for one user: the history is useful, but carrying it verbatim is expensive. We study personalized agent memory: one user's conversation history with an agent, distilled into a compact retrieval layer for later search. Each exchange is compressed into a compound object with four fields (exchange_core, specific_context, thematic room_assignments, and regex-extracted files_touched). The searchable distilled text averages 38 tokens per exchange.
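The four-field compound object from the abstract can be sketched as a simple data structure. This is a minimal illustration, not the paper's implementation: the field names come from the abstract, but the field semantics, the file-extension regex, and the helper function are assumptions.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for files_touched; the paper says only that file
# paths are regex-extracted, not which pattern it uses.
FILE_PATTERN = re.compile(r"[\w./-]+\.(?:py|js|ts|md|json|yaml|toml)\b")

@dataclass
class DistilledExchange:
    """One compressed exchange; field names follow the abstract."""
    exchange_core: str            # short summary of what the exchange was about
    specific_context: str         # concrete details worth recalling later
    room_assignments: list[str]   # thematic buckets ("rooms") for the exchange
    files_touched: list[str]      # file paths pulled from the raw text via regex

def extract_files(raw_text: str) -> list[str]:
    """Regex-extract file paths mentioned in a raw exchange (sketch)."""
    return sorted(set(FILE_PATTERN.findall(raw_text)))
```

For example, `extract_files("please edit src/main.py and config.yaml")` yields `['config.yaml', 'src/main.py']`. Serializing only the distilled fields, rather than the verbatim turns, is what keeps the searchable text to tens of tokens per exchange.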