{"ID":2874642,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04215","arxiv_id":"2509.04215","title":"PianoBind: A Multimodal Joint Embedding Model for Pop-piano Music","abstract":"Solo piano music, despite being a single-instrument medium, possesses significant expressive capabilities, conveying rich semantic information across genres, moods, and styles. However, current general-purpose music representation models, predominantly trained on large-scale datasets, often struggle to capture subtle semantic distinctions within homogeneous solo piano music. Furthermore, existing piano-specific representation models are typically unimodal, failing to capture the inherently multimodal nature of piano music, expressed through audio, symbolic, and textual modalities. To address these limitations, we propose PianoBind, a piano-specific multimodal joint embedding model. We systematically investigate strategies for multi-source training and modality utilization within a joint embedding framework optimized for capturing fine-grained semantic distinctions in (1) small-scale and (2) homogeneous piano datasets. Our experimental results demonstrate that PianoBind learns multimodal representations that effectively capture subtle nuances of piano music, achieving superior text-to-music retrieval performance on in-domain and out-of-domain piano datasets compared to general-purpose music joint embedding models. Moreover, our design choices offer reusable insights for multimodal representation learning with homogeneous datasets beyond piano music.","short_abstract":"Solo piano music, despite being a single-instrument medium, possesses significant expressive capabilities, conveying rich semantic information across genres, moods, and styles. 
However, current general-purpose music representation models, predominantly trained on large-scale datasets, often struggle to capture subtle...","url_abs":"https://arxiv.org/abs/2509.04215","url_pdf":"https://arxiv.org/pdf/2509.04215v1","authors":"[\"Hayeon Bang\",\"Eunjin Choi\",\"Seungheon Doh\",\"Juhan Nam\"]","published":"2025-09-04T13:43:53Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.IR\",\"cs.MM\"]","methods":"[]","has_code":false}