{"ID":2889592,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.20731","arxiv_id":"2507.20731","title":"Learning Neural Vocoder from Range-Null Space Decomposition","abstract":"Despite the rapid development of neural vocoders in recent years, they usually suffer from some intrinsic challenges like opaque modeling, and parameter-performance trade-off. In this study, we propose an innovative time-frequency (T-F) domain-based neural vocoder to resolve the above-mentioned challenges. To be specific, we bridge the connection between the classical signal range-null decomposition (RND) theory and vocoder task, and the reconstruction of target spectrogram can be decomposed into the superimposition between the range-space and null-space, where the former is enabled by a linear domain shift from the original mel-scale domain to the target linear-scale domain, and the latter is instantiated via a learnable network for further spectral detail generation. Accordingly, we propose a novel dual-path framework, where the spectrum is hierarchically encoded/decoded, and the cross- and narrow-band modules are elaborately devised for efficient sub-band and sequential modeling. Comprehensive experiments are conducted on the LJSpeech and LibriTTS benchmarks. Quantitative and qualitative results show that while enjoying lightweight network parameters, the proposed approach yields state-of-the-art performance among existing advanced methods. Our code and the pretrained model weights are available at https://github.com/Andong-Li-speech/RNDVoC.","short_abstract":"Despite the rapid development of neural vocoders in recent years, they usually suffer from some intrinsic challenges like opaque modeling, and parameter-performance trade-off. In this study, we propose an innovative time-frequency (T-F) domain-based neural vocoder to resolve the above-mentioned challenges. \nTo be specif...","url_abs":"https://arxiv.org/abs/2507.20731","url_pdf":"https://arxiv.org/pdf/2507.20731v1","authors":"[\"Andong Li\",\"Tong Lei\",\"Zhihang Sun\",\"Rilin Chen\",\"Erwei Yin\",\"Xiaodong Li\",\"Chengshi Zheng\"]","published":"2025-07-28T11:30:22Z","proceeding":"cs.SD","tasks":"[\"cs.SD\"]","methods":"[]","has_code":false,"code_links":[{"ID":611660,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2889592,"paper_url":"https://arxiv.org/abs/2507.20731","paper_title":"Learning Neural Vocoder from Range-Null Space Decomposition","repo_url":"https://github.com/Andong-Li-speech/RNDVoC","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
