{"ID":2863136,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.00316","arxiv_id":"2510.00316","title":"Large Language Models Can Perform Automatic Modulation Classification via Discretized Self-supervised Candidate Retrieval","abstract":"Identifying wireless modulation schemes is essential for cognitive radio, but standard supervised models often degrade under distribution shift, and training domain-specific wireless foundation models from scratch is computationally prohibitive. Large Language Models (LLMs) offer a promising training-free alternative via in-context learning, yet feeding raw floating-point signal statistics into LLMs overwhelms models with numerical noise and exhausts token budgets. We introduce DiSC-AMC, a framework that reformulates Automatic Modulation Classification (AMC) as an LLM reasoning task by combining aggressive feature discretization with nearest-neighbor retrieval over self-supervised embeddings. By mapping continuous features to coarse symbolic tokens, DiSC-AMC aligns abstract signal patterns with LLM reasoning capabilities and reduces prompt length by over $50$\\%. Simultaneously, utilizing a DINOv2 visual encoder to retrieve the $k_\\text{NN}$ most similar labeled exemplars provides highly relevant, query-specific context rather than generic class averages. On a 10-class benchmark, a fine-tuned 7B-parameter LLM using DiSC-AMC achieves $83.0$\\% in-distribution accuracy ($-10$\\,to\\,$+10$\\,dB) and $82.50$\\% out-of-distribution (OOD) accuracy ($-11$\\,to\\,$-15$\\,dB), outperforming supervised baselines. Comprehensive ablations on vanilla LLMs demonstrate the token efficiency of DiSC-AMC. A training-free $7$B LLM achieves $71$\\% accuracy using only a $0.5$\\,K-token prompt, surpassing a $200$B-parameter baseline that relies on a $2.9$\\,K-token prompt. 
Furthermore, similarity-based exemplar retrieval outperforms naive class-average selection by over $20$\\%. Finally, we identify a fundamental limitation of this pipeline. At extreme OOD noise levels ($-30$\\,dB), the underlying self-supervised representations collapse, degrading retrieval quality and reducing classification to random chance.","short_abstract":"Identifying wireless modulation schemes is essential for cognitive radio, but standard supervised models often degrade under distribution shift, and training domain-specific wireless foundation models from scratch is computationally prohibitive. Large Language Models (LLMs) offer a promising training-free alternative v...","url_abs":"https://arxiv.org/abs/2510.00316","url_pdf":"https://arxiv.org/pdf/2510.00316v2","authors":"[\"Mohammad Rostami\",\"Atik Faysal\",\"Reihaneh Gh. Roshan\",\"Huaxia Wang\",\"Nikhil Muralidhar\",\"Yu-Dong Yao\"]","published":"2025-09-30T22:20:57Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}