{"ID":2843335,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.08128","arxiv_id":"2511.08128","title":"Sentence-Anchored Gist Compression for Long-Context LLMs","abstract":"This work investigates context compression for Large Language Models (LLMs) using learned compression tokens to reduce the memory and computational demands of processing long sequences. We demonstrate that pre-trained LLMs can be fine-tuned to compress their context by factors of 2x to 8x without significant performance degradation, as evaluated on both short-context and long-context benchmarks. Furthermore, in experiments on a 3-billion-parameter LLaMA model, our method achieves results on par with alternative compression techniques while attaining higher compression ratios.","short_abstract":"This work investigates context compression for Large Language Models (LLMs) using learned compression tokens to reduce the memory and computational demands of processing long sequences. We demonstrate that pre-trained LLMs can be fine-tuned to compress their context by factors of 2x to 8x without significant performanc...","url_abs":"https://arxiv.org/abs/2511.08128","url_pdf":"https://arxiv.org/pdf/2511.08128v1","authors":"[\"Dmitrii Tarasov\",\"Elizaveta Goncharova\",\"Kuznetsov Andrey\"]","published":"2025-11-11T11:34:32Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}