{"ID":2879445,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.18295","arxiv_id":"2508.18295","title":"H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems","abstract":"Hotword customization is crucial in ASR to enhance the accuracy of domain-specific terms. It has been primarily driven by the advancements in traditional models and Audio large language models (LLMs). However, existing models often struggle with large-scale hotwords, as the recognition rate drops dramatically with the number of hotwords increasing. In this paper, we introduce a novel hotword customization system that utilizes a hotword pre-retrieval module (H-PRM) to identify the most relevant hotword candidate by measuring the acoustic similarity between the hotwords and the speech segment. This plug-and-play solution can be easily integrated into traditional models such as SeACo-Paraformer, significantly enhancing hotwords post-recall rate (PRR). Additionally, we incorporate H-PRM into Audio LLMs through a prompt-based approach, enabling seamless customization of hotwords. Extensive testing validates that H-PRM can outperform existing methods, showing a new direction for hotword customization in ASR.","short_abstract":"Hotword customization is crucial in ASR to enhance the accuracy of domain-specific terms. It has been primarily driven by the advancements in traditional models and Audio large language models (LLMs). However, existing models often struggle with large-scale hotwords, as the recognition rate drops dramatically with the...","url_abs":"https://arxiv.org/abs/2508.18295","url_pdf":"https://arxiv.org/pdf/2508.18295v1","authors":"[\"Huangyu Dai\",\"Lingtao Mao\",\"Ben Chen\",\"Zihan Wang\",\"Zihan Liang\",\"Ying Han\",\"Chenyi Lei\",\"Han Li\"]","published":"2025-08-22T13:30:22Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\",\"cs.CL\",\"eess.AS\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
