{"ID":2857673,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.09553","arxiv_id":"2510.09553","title":"Hierarchical Indexing with Knowledge Enrichment for Multilingual Video Corpus Retrieval","abstract":"Retrieving relevant instructional videos from multilingual medical archives is crucial for answering complex, multi-hop questions across language boundaries. However, existing systems either compress hour-long videos into coarse embeddings or incur prohibitive costs for fine-grained matching. We tackle the Multilingual Video Corpus Retrieval (mVCR) task in the NLPCC-2025 M4IVQA challenge with a multi-stage framework that integrates multilingual semantics, domain terminology, and efficient long-form processing. Video subtitles are divided into semantically coherent chunks, enriched with concise knowledge-graph (KG) facts, and organized into a hierarchical tree whose node embeddings are generated by a language-agnostic multilingual encoder. At query time, the same encoder embeds the input question; a coarse-to-fine tree search prunes irrelevant branches, and only the top-ranked chunks are re-scored by a lightweight large language model (LLM). This design avoids exhaustive cross-encoder scoring while preserving chunk-level precision. Experiments on the mVCR test set demonstrate state-of-the-art performance, and ablation studies confirm the complementary contributions of KG enrichment, hierarchical indexing, and targeted LLM re-ranking. The proposed method offers an accurate and scalable solution for multilingual retrieval in specialized medical video collections.","short_abstract":"Retrieving relevant instructional videos from multilingual medical archives is crucial for answering complex, multi-hop questions across language boundaries. However, existing systems either compress hour-long videos into coarse embeddings or incur prohibitive costs for fine-grained matching. We tackle the Multilingual...","url_abs":"https://arxiv.org/abs/2510.09553","url_pdf":"https://arxiv.org/pdf/2510.09553v1","authors":"[\"Yu Wang\",\"Tianhao Tan\",\"Yifei Wang\"]","published":"2025-10-10T17:06:01Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\",\"Generative Adversarial Network\"]","has_code":false}
