{"ID":2875328,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.02072","arxiv_id":"2509.02072","title":"Abex-rat: Synergizing Abstractive Augmentation and Adversarial Training for Classification of Occupational Accident Reports","abstract":"The automatic classification of occupational accident reports is pivotal for workplace safety analysis but is persistently hindered by severe class imbalance and data scarcity. In this paper, we propose ABEX-RAT, a resource-efficient framework that synergizes generative data augmentation with robust adversarial learning. Unlike computationally expensive large language models (LLMs) fine-tuning, our approach employs a two-stage abstractive-expansive (ABEX) pipeline: it first utilizes a prompt-guided LLM to distill label-critical semantics into concise abstracts, which are then expanded into diverse synthetic samples to balance the data distribution. Subsequently, we train a lightweight classifier using a random adversarial training (RAT) protocol, which stochastically injects perturbations to enhance generalization without significant computational overhead. Experimental results on the OSHA dataset demonstrate that ABEXRAT establishes a new state-of-the-art, achieving a Macro-F1 score of 90.32% and significantly outperforming both traditional baselines and fine-tuned large models. This confirms that targeted augmentation combined with robust training offers a superior, data-efficient alternative for specialized domain classification. The source code will be made publicly available upon acceptance.","short_abstract":"The automatic classification of occupational accident reports is pivotal for workplace safety analysis but is persistently hindered by severe class imbalance and data scarcity. In this paper, we propose ABEX-RAT, a resource-efficient framework that synergizes generative data augmentation with robust adversarial learnin...","url_abs":"https://arxiv.org/abs/2509.02072","url_pdf":"https://arxiv.org/pdf/2509.02072v4","authors":"[\"Jian Chen\",\"Jiabao Dou\"]","published":"2025-09-02T08:22:59Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.IR\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
