{"ID":2834817,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00745","arxiv_id":"2512.00745","title":"FastPOS: Language-Agnostic Scalable POS Tagging Framework Low-Resource Use Case","abstract":"This study proposes a language-agnostic transformer-based POS tagging framework designed for low-resource languages, using Bangla and Hindi as case studies. With only three lines of framework-specific code, the model was adapted from Bangla to Hindi, demonstrating effective portability with minimal modification. The framework achieves 96.85 percent and 97 percent token-level accuracy across POS categories in Bangla and Hindi while sustaining strong F1 scores despite dataset imbalance and linguistic overlap. A performance discrepancy in a specific POS category underscores ongoing challenges in dataset curation. The strong results stem from the underlying transformer architecture, which can be replaced with limited code adjustments. Its modular and open-source design enables rapid cross-lingual adaptation while reducing model design and tuning overhead, allowing researchers to focus on linguistic preprocessing and dataset refinement, which are essential for advancing NLP in underrepresented languages.","short_abstract":"This study proposes a language-agnostic transformer-based POS tagging framework designed for low-resource languages, using Bangla and Hindi as case studies. With only three lines of framework-specific code, the model was adapted from Bangla to Hindi, demonstrating effective portability with minimal modification. The fr...","url_abs":"https://arxiv.org/abs/2512.00745","url_pdf":"https://arxiv.org/pdf/2512.00745v1","authors":"[\"Md Abdullah Al Kafi\",\"Sumit Kumar Banshal\"]","published":"2025-11-30T05:48:12Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Transformer\"]","has_code":false}
