{"ID":2836908,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.20201","arxiv_id":"2511.20201","title":"GHR-VQA: Graph-guided Hierarchical Relational Reasoning for Video Question Answering","abstract":"We propose GHR-VQA, Graph-guided Hierarchical Relational Reasoning for Video Question Answering (Video QA), a novel human-centric framework that incorporates scene graphs to capture intricate human-object interactions within video sequences. Unlike traditional pixel-based methods, each frame is represented as a scene graph and human nodes across frames are linked to a global root, forming the video-level graph and enabling cross-frame reasoning centered on human actors. The video-level graphs are then processed by Graph Neural Networks (GNNs), transforming them into rich, context-aware embeddings for efficient processing. Finally, these embeddings are integrated with question features in a hierarchical network operating across different abstraction levels, enhancing both local and global understanding of video content. This explicit human-rooted structure enhances interpretability by decomposing actions into human-object interactions and enables a more profound understanding of spatiotemporal dynamics. We validate our approach on the Action Genome Question Answering (AGQA) dataset, achieving significant performance improvements, including a 7.3% improvement in object-relation reasoning over the state of the art.","short_abstract":"We propose GHR-VQA, Graph-guided Hierarchical Relational Reasoning for Video Question Answering (Video QA), a novel human-centric framework that incorporates scene graphs to capture intricate human-object interactions within video sequences. Unlike traditional pixel-based methods, each frame is represented as a scene g...","url_abs":"https://arxiv.org/abs/2511.20201","url_pdf":"https://arxiv.org/pdf/2511.20201v1","authors":"[\"Dionysia Danai Brilli\",\"Dimitrios Mallis\",\"Vassilis Pitsikalis\",\"Petros Maragos\"]","published":"2025-11-25T11:24:25Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Graph Neural Network\"]","has_code":false}