{"ID":2863291,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.24276","arxiv_id":"2509.24276","title":"G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge","abstract":"Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge. Retrieval-augmented generation (RAG) mitigates this by incorporating external knowledge, yet existing RAGs struggle with knowledge-intensive tasks due to fragmented information and weak modeling of knowledge structure. Graphs offer a natural way to model relationships within knowledge, but LLMs are inherently unstructured and cannot effectively reason over graph-structured data. Recent graph-enhanced RAG (GraphRAG) attempts to bridge this gap by constructing tailored graphs and enabling LLMs to reason on them. However, these methods often depend on ad-hoc graph designs, heuristic search, or costly agent pipelines, which hinder scalability and generalization. To address these challenges, we present G-reasoner, a unified framework that integrates graph and language foundation models for scalable reasoning over diverse graph-structured knowledge. Central to our approach is QuadGraph, a standardized four-layer abstraction that unifies heterogeneous knowledge sources into a common graph representation. Building on this, we introduce a 34M-parameter graph foundation model (GFM) that jointly captures graph topology and textual semantics, and is integrated with LLMs to enhance reasoning in downstream applications. To ensure scalability and efficiency, mixed-precision training and distributed message-passing are implemented to scale GFM with more GPUs. Extensive experiments on six benchmarks show that G-reasoner consistently outperforms state-of-the-art baselines, significantly enhances LLM reasoning, and achieves strong efficiency and cross-graph generalization.","short_abstract":"Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge. Retrieval-augmented generation (RAG) mitigates this by incorporating external knowledge, yet existing RAGs struggle with knowledge-intensive tasks due to fragmented information and weak modeling of...","url_abs":"https://arxiv.org/abs/2509.24276","url_pdf":"https://arxiv.org/pdf/2509.24276v4","authors":"[\"Linhao Luo\",\"Zicheng Zhao\",\"Junnan Liu\",\"Zhangchi Qiu\",\"Junnan Dong\",\"Serge Panev\",\"Chen Gong\",\"Thuy-Trang Vu\",\"Gholamreza Haffari\",\"Dinh Phung\",\"Alan Wee-Chung Liew\",\"Shirui Pan\"]","published":"2025-09-29T04:38:12Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}