{"ID":2874198,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04985","arxiv_id":"2509.04985","title":"Training a Perceptual Model for Evaluating Auditory Similarity in Music Adversarial Attack","abstract":"Music Information Retrieval (MIR) systems are highly vulnerable to adversarial attacks that are often imperceptible to humans, primarily due to a misalignment between model feature spaces and human auditory perception. Existing defenses and perceptual metrics frequently fail to adequately capture these auditory nuances, a limitation supported by our initial listening tests showing low correlation between common metrics and human judgments. To bridge this gap, we introduce Perceptually-Aligned MERT Transformer (PAMT), a novel framework for learning robust, perceptually-aligned music representations. Our core innovation lies in the psychoacoustically-conditioned sequential contrastive transformer, a lightweight projection head built atop a frozen MERT encoder. PAMT achieves a Spearman correlation coefficient of 0.65 with subjective scores, outperforming existing perceptual metrics. Our approach also achieves an average of 9.15\\% improvement in robust accuracy on challenging MIR tasks, including Cover Song Identification and Music Genre Classification, under diverse perceptual adversarial attacks. This work pioneers architecturally-integrated psychoacoustic conditioning, yielding representations significantly more aligned with human perception and robust against music adversarial attacks.","short_abstract":"Music Information Retrieval (MIR) systems are highly vulnerable to adversarial attacks that are often imperceptible to humans, primarily due to a misalignment between model feature spaces and human auditory perception. Existing defenses and perceptual metrics frequently fail to adequately capture these auditory nuances...","url_abs":"https://arxiv.org/abs/2509.04985","url_pdf":"https://arxiv.org/pdf/2509.04985v1","authors":"[\"Yuxuan Liu\",\"Rui Sang\",\"Peihong Zhang\",\"Zhixin Li\",\"Shengchen Li\"]","published":"2025-09-05T10:23:44Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"eess.AS\"]","methods":"[\"Transformer\"]","has_code":false}
