{"ID":2896840,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.05609","arxiv_id":"2507.05609","title":"MMW: Side Talk Rejection Multi-Microphone Whisper on Smart Glasses","abstract":"Smart glasses are increasingly positioned as the next-generation interface for ubiquitous access to large language models (LLMs). Nevertheless, achieving reliable interaction in real-world noisy environments remains a major challenge, particularly due to interference from side speech. In this work, we introduce a novel side-talk rejection multi-microphone Whisper (MMW) framework for smart glasses, incorporating three key innovations. First, we propose a Mix Block based on a Tri-Mamba architecture to effectively fuse multi-channel audio at the raw waveform level, while maintaining compatibility with streaming processing. Second, we design a Frame Diarization Mamba Layer to enhance frame-level side-talk suppression, facilitating more efficient fine-tuning of Whisper models. Third, we employ a Multi-Scale Group Relative Policy Optimization (GRPO) strategy to jointly optimize frame-level and utterance-level side speech suppression. Experimental evaluations demonstrate that the proposed MMW system can reduce the word error rate (WER) by 4.95% in noisy conditions.","short_abstract":"Smart glasses are increasingly positioned as the next-generation interface for ubiquitous access to large language models (LLMs). Nevertheless, achieving reliable interaction in real-world noisy environments remains a major challenge, particularly due to interference from side speech. \nIn this work, we introduce a novel...","url_abs":"https://arxiv.org/abs/2507.05609","url_pdf":"https://arxiv.org/pdf/2507.05609v1","authors":"[\"Yang Liu\",\"Li Wan\",\"Yiteng Huang\",\"Yong Xu\",\"yangyang shi\",\"Saurabh Adya\",\"ming sun\",\"Florian Metze\"]","published":"2025-07-08T02:37:20Z","proceeding":"eess.AS","tasks":"[\"eess.AS\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
