{"ID":2840181,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.14738","arxiv_id":"2511.14738","title":"LAUD: Integrating Large Language Models with Active Learning for Unlabeled Data","abstract":"Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, lacking labeled data often prevents practitioners from obtaining well-performing models, thereby forcing practitioners to highly rely on prompt-based approaches that are often tedious, inefficient, and driven by trial and error. To alleviate this issue of lacking labeled data, we present a learning framework integrating LLMs with active learning for unlabeled dataset (LAUD). LAUD mitigates the cold-start problem by constructing an initial label set with zero-shot learning. Experimental results show that LLMs derived from LAUD outperform LLMs with zero-shot or few-shot learning on commodity name classification tasks, demonstrating the effectiveness of LAUD.","short_abstract":"Large language models (LLMs) have shown a remarkable ability to generalize beyond their pre-training data, and fine-tuning LLMs can elevate performance to human-level and beyond. However, in real-world scenarios, lacking labeled data often prevents practitioners from obtaining well-performing models, thereby forcing pr...","url_abs":"https://arxiv.org/abs/2511.14738","url_pdf":"https://arxiv.org/pdf/2511.14738v1","authors":"[\"Tzu-Hsuan Chou\",\"Chun-Nan Chou\"]","published":"2025-11-18T18:31:00Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}