{"ID":2863740,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.25296","arxiv_id":"2509.25296","title":"Learning Relationships Between Separate Audio Tracks for Creative Applications","abstract":"This paper presents the first step in a research project situated within the field of musical agents. The objective is to achieve, through training, the tuning of the desired musical relationship between a live musical input and a real-time generated musical output, through the curation of a database of separated tracks. We propose an architecture integrating a symbolic decision module capable of learning and exploiting musical relationships from such a musical corpus. We detail an offline implementation of this architecture employing Transformers as the decision module, associated with a perception module based on Wav2Vec 2.0, and concatenative synthesis as audio renderer. We present a quantitative evaluation of the decision module's ability to reproduce learned relationships extracted during training. We demonstrate that our decision module can predict a coherent track B when conditioned by its corresponding \"guide\" track A, based on a corpus of paired tracks (A, B).","short_abstract":"This paper presents the first step in a research project situated within the field of musical agents. The objective is to achieve, through training, the tuning of the desired musical relationship between a live musical input and a real-time generated musical output, through the curation of a database of separated track...","url_abs":"https://arxiv.org/abs/2509.25296","url_pdf":"https://arxiv.org/pdf/2509.25296v1","authors":"[\"Balthazar Bujard\",\"Jérôme Nika\",\"Fédéric Bevilacqua\",\"Nicolas Obin\"]","published":"2025-09-29T16:06:21Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.AI\",\"cs.HC\",\"cs.LG\",\"eess.AS\"]","methods":"[\"Transformer\"]","has_code":false}