{"ID":2857237,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.08876","arxiv_id":"2510.08876","title":"Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval","abstract":"We present a repository decomposition system that converts large software repositories into a vectorized knowledge graph that mirrors the project's architectural and semantic structure, capturing semantic relationships and enabling a significant level of automation in further repository development. The graph encodes syntactic relations such as containment, implementation, references, calls, and inheritance, and augments nodes with LLM-derived summaries and vector embeddings. A hybrid retrieval pipeline combines semantic retrieval with graph-aware expansion, and an LLM-based assistant formulates constrained, read-only graph requests and produces human-oriented explanations.","short_abstract":"We present a repository decomposition system that converts large software repositories into a vectorized knowledge graph that mirrors the project's architectural and semantic structure, capturing semantic relationships and enabling a significant level of automation in further repository development. The graph encodes sy...","url_abs":"https://arxiv.org/abs/2510.08876","url_pdf":"https://arxiv.org/pdf/2510.08876v1","authors":"[\"Kostiantyn Bevziuk\",\"Andrii Fatula\",\"Svetozar Lashin Yaroslav Opanasenko\",\"Anna Tukhtarova\",\"Ashok Jallepalli Pradeepkumar Sharma\",\"Hritvik Shrivastava\"]","published":"2025-10-10T00:13:50Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\"]","methods":"[\"Large Language Model\"]","has_code":false}
