{"ID":2898221,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.03226","arxiv_id":"2507.03226","title":"Towards Practical GraphRAG: Efficient Knowledge Graph Construction and Hybrid Retrieval at Scale","abstract":"We propose a scalable and cost-efficient framework for deploying Graph-based Retrieval-Augmented Generation (GraphRAG) in enterprise environments. While GraphRAG has shown promise for multi- hop reasoning and structured retrieval, its adoption has been limited due to reliance on expensive large language model (LLM)-based extraction and complex traversal strategies. To address these challenges, we introduce two core innovations: (1) an efficient knowledge graph construction pipeline that leverages dependency parsing to achieve 94% of LLM-based performance (61.87% vs. 65.83%) while significantly reducing costs and improving scalability; and (2) a hybrid retrieval strategy that fuses vector similarity with graph traversal using Reciprocal Rank Fusion (RRF), maintaining separate embeddings for entities, chunks, and relations to enable multi-granular matching. We evaluate our framework on two enterprise datasets focused on legacy code migration and demonstrate improvements of up to 15% and 4.35% over vanilla vector retrieval baselines using LLM-as-Judge evaluation metrics. These results validate the feasibility of deploying GraphRAG in production enterprise environments, demonstrating that careful engineering of classical NLP techniques can match modern LLM-based approaches while enabling practical, cost-effective, and domain-adaptable retrieval-augmented reasoning at scale.","short_abstract":"We propose a scalable and cost-efficient framework for deploying Graph-based Retrieval-Augmented Generation (GraphRAG) in enterprise environments. While GraphRAG has shown promise for multi- hop reasoning and structured retrieval, its adoption has been limited due to reliance on expensive large language model (LLM)-bas...","url_abs":"https://arxiv.org/abs/2507.03226","url_pdf":"https://arxiv.org/pdf/2507.03226v3","authors":"[\"Congmin Min\",\"Sahil Bansal\",\"Joyce Pan\",\"Abbas Keshavarzi\",\"Rhea Mathew\",\"Amar Viswanathan Kannan\"]","published":"2025-07-04T00:05:55Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}
