{"ID":2867740,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.17807","arxiv_id":"2509.17807","title":"Everyday Physics in Korean Contexts: A Culturally Grounded Physical Reasoning Benchmark","abstract":"Existing physical commonsense reasoning benchmarks predominantly focus on Western contexts, overlooking cultural variations in physical problem-solving. To address this gap, we introduce EPiK (Everyday Physics in Korean Contexts), a novel benchmark comprising 181 binary-choice problems that test physical reasoning within Korean cultural contexts, ranging from kimchi (Korean food) to traditional fermentation. EPiK is constructed using a two-stage generation and verification pipeline to create culturally-authentic problems across 9 reasoning subtasks and 84 scenarios. Unlike approaches based on simple translation, our method generates problems organically from Korean contexts while upholding rigorous physical reasoning standards. Our evaluations show that Korean-specialized models consistently outperform general-purpose models of comparable size. This performance gap highlights the limitations of culturally-agnostic models and demonstrates the critical need for culturally-aware benchmarks to truly measure language understanding. Our EPiK is publicly available at https://huggingface.co/datasets/jjae/EPiK.","short_abstract":"Existing physical commonsense reasoning benchmarks predominantly focus on Western contexts, overlooking cultural variations in physical problem-solving. To address this gap, we introduce EPiK (Everyday Physics in Korean Contexts), a novel benchmark comprising 181 binary-choice problems that test physical reasoning with...","url_abs":"https://arxiv.org/abs/2509.17807","url_pdf":"https://arxiv.org/pdf/2509.17807v2","authors":"[\"Jihae Jeong\",\"DaeYeop Lee\",\"DongGeon Lee\",\"Hwanjo Yu\"]","published":"2025-09-22T14:01:14Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
