{"ID":2871412,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.11369","arxiv_id":"2509.11369","title":"Decoding Musical Origins: Distinguishing Human and AI Composers","abstract":"With the rapid advancement of Large Language Models (LLMs), AI-driven music generation has become a vibrant and fruitful area of research. However, the representation of musical data remains a significant challenge. To address this, a novel, machine-learning-friendly music notation system, YNote, was developed. This study leverages YNote to train an effective classification model capable of distinguishing whether a piece of music was composed by a human (Native), a rule-based algorithm (Algorithm Generated), or an LLM (LLM Generated). We frame this as a text classification problem, applying the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm to extract structural features from YNote sequences and using the Synthetic Minority Over-sampling Technique (SMOTE) to address data imbalance. The resulting model achieves an accuracy of 98.25%, successfully demonstrating that YNote retains sufficient stylistic information for analysis. More importantly, the model can identify the unique \"technological fingerprints\" left by different AI generation techniques, providing a powerful tool for tracing the origins of AI-generated content.","short_abstract":"With the rapid advancement of Large Language Models (LLMs), AI-driven music generation has become a vibrant and fruitful area of research. However, the representation of musical data remains a significant challenge. To address this, a novel, machine-learning-friendly music notation system, YNote, was developed. This st...","url_abs":"https://arxiv.org/abs/2509.11369","url_pdf":"https://arxiv.org/pdf/2509.11369v1","authors":"[\"Cheng-Yang Tsai\",\"Tzu-Wei Huang\",\"Shao-Yu Wei\",\"Guan-Wei Chen\",\"Hung-Ying Chu\",\"Yu-Cheng Lin\"]","published":"2025-09-14T17:50:33Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
