{"ID":2832997,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.04793","arxiv_id":"2512.04793","title":"YingMusic-SVC: Real-World Robust Zero-Shot Singing Voice Conversion with Flow-GRPO and Singing-Specific Inductive Biases","abstract":"Singing voice conversion (SVC) aims to render the target singer's timbre while preserving melody and lyrics. However, existing zero-shot SVC systems remain fragile in real songs due to harmony interference, F0 errors, and the lack of inductive biases for singing. We propose YingMusic-SVC, a robust zero-shot framework that unifies continuous pre-training, robust supervised fine-tuning, and Flow-GRPO reinforcement learning. Our model introduces a singing-trained RVC timbre shifter for timbre-content disentanglement, an F0-aware timbre adaptor for dynamic vocal expression, and an energy-balanced rectified flow matching loss to enhance high-frequency fidelity. Experiments on a graded multi-track benchmark show that YingMusic-SVC achieves consistent improvements over strong open-source baselines in timbre similarity, intelligibility, and perceptual naturalness, especially under accompanied and harmony-contaminated conditions, demonstrating its effectiveness for real-world SVC deployment.","short_abstract":"Singing voice conversion (SVC) aims to render the target singer's timbre while preserving melody and lyrics. However, existing zero-shot SVC systems remain fragile in real songs due to harmony interference, F0 errors, and the lack of inductive biases for singing. We propose YingMusic-SVC, a robust zero-shot framework t...","url_abs":"https://arxiv.org/abs/2512.04793","url_pdf":"https://arxiv.org/pdf/2512.04793v1","authors":"[\"Gongyu Chen\",\"Xiaoyu Zhang\",\"Zhenqiang Weng\",\"Junjie Zheng\",\"Da Shen\",\"Chaofan Ding\",\"Wei-Qiang Zhang\",\"Zihao Chen\"]","published":"2025-12-04T13:38:50Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}
