{"ID":2894880,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.10281","arxiv_id":"2507.10281","title":"Toward Real-World Table Agents: Capabilities, Workflows, and Design Principles for LLM-based Table Intelligence","abstract":"Tables are fundamental in domains such as finance, healthcare, and public administration, yet real-world table tasks often involve noise, structural heterogeneity, and semantic complexity--issues underexplored in existing research that primarily targets clean academic datasets. This survey focuses on LLM-based Table Agents, which aim to automate table-centric workflows by integrating preprocessing, reasoning, and domain adaptation. We define five core competencies--C1: Table Structure Understanding, C2: Table and Query Semantic Understanding, C3: Table Retrieval and Compression, C4: Executable Reasoning with Traceability, and C5: Cross-Domain Generalization--to analyze and compare current approaches. In addition, a detailed examination of the Text-to-SQL Agent reveals a performance gap between academic benchmarks and real-world scenarios, especially for open-source models. Finally, we provide actionable insights to improve the robustness, generalization, and efficiency of LLM-based Table Agents in practical settings.","short_abstract":"Tables are fundamental in domains such as finance, healthcare, and public administration, yet real-world table tasks often involve noise, structural heterogeneity, and semantic complexity--issues underexplored in existing research that primarily targets clean academic datasets. This survey focuses on LLM-based Table Ag...","url_abs":"https://arxiv.org/abs/2507.10281","url_pdf":"https://arxiv.org/pdf/2507.10281v1","authors":"[\"Jiaming Tian\",\"Liyao Li\",\"Wentao Ye\",\"Haobo Wang\",\"Lingxin Wang\",\"Lihua Yu\",\"Zujie Ren\",\"Gang Chen\",\"Junbo Zhao\"]","published":"2025-07-14T13:48:13Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.DB\"]","methods":"[\"Large Language Model\"]","has_code":false}
