{"ID":2855715,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.12409","arxiv_id":"2510.12409","title":"PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks","abstract":"We present PricingLogic, the first benchmark that probes whether Large Language Models (LLMs) can reliably automate tourism-related prices when multiple, overlapping fare rules apply. Travel agencies are eager to offload this error-prone task onto AI systems; however, deploying LLMs without verified reliability could result in significant financial losses and erode customer trust. PricingLogic comprises 300 natural-language questions based on booking requests derived from 42 real-world pricing policies, spanning two levels of difficulty: (i) basic customer-type pricing and (ii) bundled-tour calculations involving interacting discounts. Evaluations of a line of LLMs reveal a steep performance drop on the harder tier, exposing systematic failures in rule interpretation and arithmetic reasoning. These results highlight that, despite their general capabilities, today's LLMs remain unreliable in revenue-critical applications without further safeguards or domain adaptation. Our code and dataset are available at https://github.com/EIT-NLP/PricingLogic.","short_abstract":"We present PricingLogic, the first benchmark that probes whether Large Language Models (LLMs) can reliably automate tourism-related prices when multiple, overlapping fare rules apply. Travel agencies are eager to offload this error-prone task onto AI systems; however, deploying LLMs without verified reliability could re...","url_abs":"https://arxiv.org/abs/2510.12409","url_pdf":"https://arxiv.org/pdf/2510.12409v1","authors":"[\"Yunuo Liu\",\"Dawei Zhu\",\"Zena Al-Khalili\",\"Dai Cheng\",\"Yanjun Chen\",\"Dietrich Klakow\",\"Wei Zhang\",\"Xiaoyu Shen\"]","published":"2025-10-14T11:42:15Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":608278,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2855715,"paper_url":"https://arxiv.org/abs/2510.12409","paper_title":"PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks","repo_url":"https://github.com/EIT-NLP/PricingLogic","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}