{"ID":2874941,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.03131","arxiv_id":"2509.03131","title":"RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation","abstract":"Recent advances in LLM-based recommendation have shown promise, yet their cross-domain generalization is hindered by a fundamental mismatch between language-centric pretraining and the recommendation task. Existing methods, relying on language-level knowledge, fail to capture dynamic, item-level user interests across domains. To bridge this gap, we propose RecBase, a domain-agnostic foundational model pretrained with a recommendation-oriented objective. RecBase leverages a large-scale, heterogeneous, cross-domain corpus with unified textual representations and feature mappings to enhance cross-domain generalization. To further align item semantics across domains, we introduce a unified item tokenizer that encodes items into hierarchical concept identifiers, enabling structured representation and efficient vocabulary sharing. The model is trained using an autoregressive objective to capture complex item-level sequential patterns. On eight real-world datasets, our 1.5B-parameter model matches or surpasses the performance of LLM baselines up to 7B parameters in zero-shot and cross-domain recommendation tasks.","short_abstract":"Recent advances in LLM-based recommendation have shown promise, yet their cross-domain generalization is hindered by a fundamental mismatch between language-centric pretraining and the recommendation task. Existing methods, relying on language-level knowledge, fail to capture dynamic, item-level user interests across d...","url_abs":"https://arxiv.org/abs/2509.03131","url_pdf":"https://arxiv.org/pdf/2509.03131v1","authors":"[\"Sashuai Zhou\",\"Weinan Gan\",\"Qijiong Liu\",\"Ke Lei\",\"Jieming Zhu\",\"Hai Huang\",\"Yan Xia\",\"Ruiming Tang\",\"Zhenhua Dong\",\"Zhou Zhao\"]","published":"2025-09-03T08:33:43Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.LG\"]","methods":"[\"Large Language Model\"]","has_code":false}
