{"ID":3005050,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-05T07:32:30.50480903Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03284","arxiv_id":"2606.03284","title":"SEA-NLI: Natural Language Inference as a Lens into Southeast Asian Cultural Understanding","abstract":"Frontier LLMs perform well in Western contexts, but remain poorly tested on underrepresented cultures such as those in Southeast Asia (SEA). Existing NLI benchmarks are largely Western-centric, translation-derived, or monolingual, limiting their ability to measure culturally grounded reasoning. We introduce SEA-NLI, a native, culturally grounded NLI benchmark covering eight SEA countries in English and native regional languages, verified by native speakers. Across 17 encoder and decoder models, we observe low performance across all models, especially for knowledge-intensive categories such as Languages and Science and Technology. Our analysis shows that failure cases mainly stem from missing SEA cultural knowledge: SEA-adapted models and culture-aware prompting improve performance, while CoT prompting offers limited gains.","short_abstract":"Frontier LLMs perform well in Western contexts, but remain poorly tested on underrepresented cultures such as those in Southeast Asia (SEA). Existing NLI benchmarks are largely Western-centric, translation-derived, or monolingual, limiting their ability to measure culturally grounded reasoning. We introduce SEA-NLI, a...","url_abs":"https://arxiv.org/abs/2606.03284","url_pdf":"https://arxiv.org/pdf/2606.03284v1","authors":"[\"Peerawat Chomphooyod\",\"Jian Gang Ngui\",\"Yosephine Susanto\",\"Attapol T. Rutherford\",\"Alham Fikri Aji\",\"Sarana Nutanong\",\"Can Udomcharoenchaikit\",\"Peerat Limkonchotiwat\"]","published":"2026-06-02T07:49:50Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\"]","has_code":false}