{"ID":2895572,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.08322","arxiv_id":"2507.08322","title":"Towards Efficient Quantity Retrieval from Text: An Approach via Description Parsing and Weak Supervision","abstract":"Quantitative facts are continually generated by companies and governments, supporting data-driven decision-making. While common facts are structured, many long-tail quantitative facts remain buried in unstructured documents, making them difficult to access. We propose the task of Quantity Retrieval: given a description of a quantitative fact, the system returns the relevant value and supporting evidence. Understanding quantity semantics in context is essential for this task. We introduce a framework based on description parsing that converts text into structured (description, quantity) pairs for effective retrieval. To improve learning, we construct a large paraphrase dataset using weak supervision based on quantity co-occurrence. We evaluate our approach on a large corpus of financial annual reports and a newly annotated quantity description dataset. Our method significantly improves top-1 retrieval accuracy from 30.98 percent to 64.66 percent.","short_abstract":"Quantitative facts are continually generated by companies and governments, supporting data-driven decision-making. While common facts are structured, many long-tail quantitative facts remain buried in unstructured documents, making them difficult to access. We propose the task of Quantity Retrieval: given a description...","url_abs":"https://arxiv.org/abs/2507.08322","url_pdf":"https://arxiv.org/pdf/2507.08322v2","authors":"[\"Yixuan Cao\",\"Zhengrong Chen\",\"Chengxuan Xia\",\"Kun Wu\",\"Ping Luo\"]","published":"2025-07-11T05:25:09Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.LG\"]","methods":"[]","has_code":false}
