{"ID":2885587,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.05688","arxiv_id":"2508.05688","title":"LLM4ES: Learning User Embeddings from Event Sequences via Large Language Models","abstract":"This paper presents LLM4ES, a novel framework that exploits large pre-trained language models (LLMs) to derive user embeddings from event sequences. Event sequences are transformed into a textual representation, which is subsequently used to fine-tune an LLM through next-token prediction to generate high-quality embeddings. We introduce a text enrichment technique that enhances LLM adaptation to event sequence data, improving representation quality for low-variability domains. Experimental results demonstrate that LLM4ES achieves state-of-the-art performance in user classification tasks in financial and other domains, outperforming existing embedding methods. The resulting user embeddings can be incorporated into a wide range of applications, from user segmentation in finance to patient outcome prediction in healthcare.","short_abstract":"This paper presents LLM4ES, a novel framework that exploits large pre-trained language models (LLMs) to derive user embeddings from event sequences. Event sequences are transformed into a textual representation, which is subsequently used to fine-tune an LLM through next-token prediction to generate high-quality embedd...","url_abs":"https://arxiv.org/abs/2508.05688","url_pdf":"https://arxiv.org/pdf/2508.05688v2","authors":"[\"Aleksei Shestov\",\"Omar Zoloev\",\"Maksim Makarenko\",\"Mikhail Orlov\",\"Egor Fadeev\",\"Ivan Kireev\",\"Andrey Savchenko\"]","published":"2025-08-06T06:54:06Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}