{"ID":3006174,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-04T19:14:31.964469513Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03028","arxiv_id":"2606.03028","title":"Audio Spotforming via Post-Filtering Using Cross-Array Non-target Estimates","abstract":"Audio spotforming is a technique for extracting target speech from noisy mixtures by utilizing multiple microphone arrays. Conventional methods estimate a shared target speech component from linearly separated signals obtained by each array using low-rank approximations and apply post filtering (PF) based on this estimated low-rank representation. However, owing to the mismatch between low-rank models and the complex structure of speech signals, directly relying on low-rank approximations for PF can degrade the speech extraction performance. In this study, we leverage the observation that non-target components located in the target speech direction from the perspective of one array can be spatially separated when viewed from other arrays. This insight motivates a new spotforming method for efficient post-filter estimation using non-target estimates across arrays instead of relying on low-rank approximations. Experiments demonstrate that the proposed method outperforms conventional spotforming methods.","short_abstract":"Audio spotforming is a technique for extracting target speech from noisy mixtures by utilizing multiple microphone arrays. Conventional methods estimate a shared target speech component from linearly separated signals obtained by each array using low-rank approximations and apply post filtering (PF) based on this estim...","url_abs":"https://arxiv.org/abs/2606.03028","url_pdf":"https://arxiv.org/pdf/2606.03028v1","authors":"[\"Yuto Ishikawa\",\"Li Li\",\"Shogo Seki\",\"Kouei Yamaoka\"]","published":"2026-06-02T02:07:29Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[]","has_code":false}
