{"ID":2873901,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.05585","arxiv_id":"2509.05585","title":"Natural Language-Programming Language Software Traceability Link Recovery Needs More than Textual Similarity","abstract":"In the field of software traceability link recovery (TLR), textual similarity has long been regarded as the core criterion. However, in tasks involving natural language and programming language (NL-PL) artifacts, relying solely on textual similarity is limited by their semantic gap. To this end, we conducted a large-scale empirical evaluation across various types of TLR tasks, revealing the limitations of textual similarity in NL-PL scenarios. To address these limitations, we propose an approach that incorporates multiple domain-specific auxiliary strategies, identified through empirical analysis, into two models: the Heterogeneous Graph Transformer (HGT) via edge types and the prompt-based Gemini 2.5 Pro via additional input information. We then evaluated our approach using the widely studied requirements-to-code TLR task, a representative case of NL-PL TLR. Experimental results show that both the multi-strategy HGT and Gemini 2.5 Pro models outperformed their original counterparts without strategy integration. Furthermore, compared to the current state-of-the-art method HGNNLink, the multi-strategy HGT and Gemini 2.5 Pro models achieved average F1-score improvements of 3.68% and 8.84%, respectively, across twelve open-source projects, demonstrating the effectiveness of multi-strategy integration in enhancing overall model performance for the requirements-code TLR task.","short_abstract":"In the field of software traceability link recovery (TLR), textual similarity has long been regarded as the core criterion. However, in tasks involving natural language and programming language (NL-PL) artifacts, relying solely on textual similarity is limited by their semantic gap. To this end, we conducted a large-sc...","url_abs":"https://arxiv.org/abs/2509.05585","url_pdf":"https://arxiv.org/pdf/2509.05585v1","authors":"[\"Zhiyuan Zou\",\"Bangchao Wang\",\"Peng Liang\",\"Tingting Bi\",\"Huan Jin\"]","published":"2025-09-06T04:15:09Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\"]","methods":"[\"Transformer\",\"Graph Neural Network\"]","has_code":false}