{"ID":2864917,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.21739","arxiv_id":"2509.21739","title":"Noise-to-Notes: Diffusion-based Generation and Refinement for Automatic Drum Transcription","abstract":"Automatic drum transcription (ADT) is traditionally formulated as a discriminative task to predict drum events from audio spectrograms. In this work, we redefine ADT as a conditional generative task and introduce Noise-to-Notes (N2N), a framework leveraging diffusion modeling to transform audio-conditioned Gaussian noise into drum events with associated velocities. This generative diffusion approach offers distinct advantages, including a flexible speed-accuracy trade-off and strong inpainting capabilities. However, the generation of binary onset and continuous velocity values presents a challenge for diffusion models, and to overcome this, we introduce an Annealed Pseudo-Huber loss to facilitate effective joint optimization. Finally, to augment low-level spectrogram features, we propose incorporating features extracted from music foundation models (MFMs), which capture high-level semantic information and enhance robustness to out-of-domain drum audio. Experimental results demonstrate that including MFM features significantly improves robustness and N2N establishes a new state-of-the-art performance across multiple ADT benchmarks.","short_abstract":"Automatic drum transcription (ADT) is traditionally formulated as a discriminative task to predict drum events from audio spectrograms. In this work, we redefine ADT as a conditional generative task and introduce Noise-to-Notes (N2N), a framework leveraging diffusion modeling to transform audio-conditioned Gaussian noi...","url_abs":"https://arxiv.org/abs/2509.21739","url_pdf":"https://arxiv.org/pdf/2509.21739v2","authors":"[\"Michael Yeung\",\"Keisuke Toyama\",\"Toya Teramoto\",\"Shusuke Takahashi\",\"Tamaki Kojima\"]","published":"2025-09-26T01:12:43Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.LG\",\"eess.AS\"]","methods":"[\"Diffusion Model\"]","has_code":false}
