{"ID":2875763,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.01184","arxiv_id":"2509.01184","title":"MARS: Modality-Aligned Retrieval for Sequence Augmented CTR Prediction","abstract":"Click-through rate (CTR) prediction serves as a cornerstone of recommender systems. Despite the strong performance of current CTR models based on user behavior modeling, they are still severely limited by interaction sparsity, especially in low-active user scenarios. To address this issue, data augmentation of user behavior is a promising research direction. However, existing data augmentation methods heavily rely on collaborative signals while overlooking the rich multimodal features of items, leading to insufficient modeling of low-active users. To alleviate this problem, we propose a novel framework \\textbf{MARS} (\\textbf{M}odality-\\textbf{A}ligned \\textbf{R}etrieval for \\textbf{S}equence Augmented CTR Prediction). MARS utilizes a Stein kernel-based approach to align text and image features into a unified and unbiased semantic space to construct multimodal user embeddings. Subsequently, each low-active user's behavior sequence is augmented by retrieving, filtering, and concentrating the most similar behavior sequence of high-active users via multimodal user embeddings. Validated by extensive offline experiments and online A/B tests, our framework MARS consistently outperforms state-of-the-art baselines and achieves substantial growth on core business metrics within Kuaishou~\\footnote{https://www.kuaishou.com/}. Consequently, MARS has been successfully deployed, serving the main traffic for hundreds of millions of users. To ensure reproducibility, we provide anonymous access to the implementation code~\\footnote{https://github.com/wangshukuan/MARS}.","short_abstract":"Click-through rate (CTR) prediction serves as a cornerstone of recommender systems. Despite the strong performance of current CTR models based on user behavior modeling, they are still severely limited by interaction sparsity, especially in low-active user scenarios. To address this issue, data augmentation of user beh...","url_abs":"https://arxiv.org/abs/2509.01184","url_pdf":"https://arxiv.org/pdf/2509.01184v1","authors":"[\"Yutian Xiao\",\"Shukuan Wang\",\"Binhao Wang\",\"Zhao Zhang\",\"Yanze Zhang\",\"Shanqi Liu\",\"Chao Feng\",\"Xiang Li\",\"Fuzhen Zhuang\"]","published":"2025-09-01T07:08:44Z","proceeding":"cs.IR","tasks":"[\"cs.IR\"]","methods":"[]","project_urls":"[\"https://www.kuaishou.com/\"]","has_code":false,"code_links":[{"ID":610237,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2875763,"paper_url":"https://arxiv.org/abs/2509.01184","paper_title":"MARS: Modality-Aligned Retrieval for Sequence Augmented CTR Prediction","repo_url":"https://github.com/wangshukuan/MARS","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
