{"ID":2874574,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04091","arxiv_id":"2509.04091","title":"Revisiting Third-Party Library Detection: A Ground Truth Dataset and Its Implications Across Security Tasks","abstract":"Accurate detection of third-party libraries (TPLs) is fundamental to Android security, supporting vulnerability tracking, malware detection, and supply chain auditing. Despite many proposed tools, their real-world effectiveness remains unclear. We present the first large-scale empirical study of ten state-of-the-art TPL detection techniques across over 6,000 apps, enabled by a new ground truth dataset with precise version-level annotations for both remote and local dependencies. Our evaluation exposes tool fragility to R8-era transformations, weak version discrimination, inaccurate correspondence of candidate libraries, difficulty in generalizing similarity thresholds, and prohibitive runtime/memory overheads at scale. Beyond tool assessment, we further analyze how TPLs shape downstream tasks, including vulnerability analysis, malware detection, secret leakage assessment, and LLM-based evaluation. From this perspective, our study provides concrete insights into how TPL characteristics affect these tasks and informs future improvements in security analysis.","short_abstract":"Accurate detection of third-party libraries (TPLs) is fundamental to Android security, supporting vulnerability tracking, malware detection, and supply chain auditing. Despite many proposed tools, their real-world effectiveness remains unclear. \nWe present the first large-scale empirical study of ten state-of-the-art TP...","url_abs":"https://arxiv.org/abs/2509.04091","url_pdf":"https://arxiv.org/pdf/2509.04091v2","authors":"[\"Jintao Gu\",\"Haolang Lu\",\"Guoshun Nan\",\"Yihan Lin\",\"Kun Wang\",\"Yuchun Guo\",\"Yigui Cao\",\"Yang Liu\"]","published":"2025-09-04T10:48:02Z","proceeding":"cs.CR","tasks":"[\"cs.CR\"]","methods":"[\"Large Language Model\"]","has_code":false}
