{"ID":2857138,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10243","arxiv_id":"2510.10243","title":"Efficient Mining of Low-Utility Sequential Patterns","abstract":"Discovering valuable insights from rich data is a crucial task for exploratory data analysis. Sequential pattern mining (SPM) has found widespread applications across various domains. In recent years, low-utility sequential pattern mining (LUSPM) has shown strong potential in applications such as intrusion detection and genomic sequence analysis. However, existing research in utility-based SPM focuses on high-utility sequential patterns, and the definitions and strategies used in high-utility SPM cannot be directly applied to LUSPM. Moreover, no algorithms have yet been developed specifically for mining low-utility sequential patterns. To address these problems, we formalize the LUSPM problem, redefine sequence utility, and introduce a compact data structure called the sequence-utility chain to efficiently record utility information. Furthermore, we propose three novel algorithms--LUSPM_b, LUSPM_s, and LUSPM_e--to discover the complete set of low-utility sequential patterns. LUSPM_b serves as an exhaustive baseline, while LUSPM_s and LUSPM_e build upon it, generating subsequences through shrinkage and extension operations, respectively. In addition, we introduce the maximal non-mutually contained sequence set and incorporate multiple pruning strategies, which significantly reduce redundant operations in both LUSPM_s and LUSPM_e. Finally, extensive experimental results demonstrate that both LUSPM_s and LUSPM_e substantially outperform LUSPM_b and exhibit excellent scalability. Notably, LUSPM_e achieves superior efficiency, requiring less runtime and memory consumption than LUSPM_s. 
Our code is available at https://github.com/Zhidong-Lin/LUSPM.","short_abstract":"Discovering valuable insights from rich data is a crucial task for exploratory data analysis. Sequential pattern mining (SPM) has found widespread applications across various domains. In recent years, low-utility sequential pattern mining (LUSPM) has shown strong potential in applications such as intrusion detection an...","url_abs":"https://arxiv.org/abs/2510.10243","url_pdf":"https://arxiv.org/pdf/2510.10243v2","authors":"[\"Jian Zhu\",\"Zhidong Lin\",\"Wensheng Gan\",\"Philip S. Yu\"]","published":"2025-10-11T14:52:04Z","proceeding":"cs.DB","tasks":"[\"cs.DB\"]","methods":"[\"LoRA\"]","has_code":false,"code_links":[{"ID":608419,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2857138,"paper_url":"https://arxiv.org/abs/2510.10243","paper_title":"Efficient Mining of Low-Utility Sequential Patterns","repo_url":"https://github.com/Zhidong-Lin/LUSPM","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
