{"ID":2859308,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05840","arxiv_id":"2510.05840","title":"Multimodal Trajectory Representation Learning for Travel Time Estimation","abstract":"Accurate travel time estimation (TTE) plays a crucial role in intelligent transportation systems. However, it remains challenging due to heterogeneous data sources and complex traffic dynamics. Moreover, traditional approaches typically convert trajectory data into fixed-length representations. This overlooks the inherent variability of real-world motion patterns, often resulting in information loss and redundancy. To address these challenges, this paper introduces the Multimodal Dynamic Trajectory Integration (MDTI) framework--a novel multimodal trajectory representation learning approach that integrates GPS sequences, grid trajectories, and road network constraints to enhance the performance of TTE. MDTI employs modality-specific encoders and a multimodal fusion module to capture complementary spatial, temporal, and topological semantics, while a dynamic trajectory modeling mechanism adaptively regulates information density for trajectories of varying lengths. Two self-supervised pretraining objectives, named contrastive alignment and masked language modeling, further strengthen multimodal consistency and contextual understanding. Extensive experiments on three real-world datasets demonstrate that MDTI consistently outperforms state-of-the-art baselines, confirming its robustness and strong generalization abilities. The code is publicly available at: https://github.com/City-Computing/MDTI.","short_abstract":"Accurate travel time estimation (TTE) plays a crucial role in intelligent transportation systems. However, it remains challenging due to heterogeneous data sources and complex traffic dynamics. Moreover, traditional approaches typically convert trajectory data into fixed-length representations. This overlooks the inher...","url_abs":"https://arxiv.org/abs/2510.05840","url_pdf":"https://arxiv.org/pdf/2510.05840v2","authors":"[\"Zhi Liu\",\"Xuyuan Hu\",\"Xiao Han\",\"Zhehao Dai\",\"Zhaolin Deng\",\"Guojiang Shen\",\"Xiangjie Kong\"]","published":"2025-10-07T12:04:16Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":608627,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2859308,"paper_url":"https://arxiv.org/abs/2510.05840","paper_title":"Multimodal Trajectory Representation Learning for Travel Time Estimation","repo_url":"https://github.com/City-Computing/MDTI","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}