{"ID":2830301,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.19703","arxiv_id":"2512.19703","title":"ASK: Adaptive Self-improving Knowledge Framework for Audio Text Retrieval","abstract":"The dominant paradigm for Audio-Text Retrieval (ATR) relies on dual-encoder architectures optimized via mini-batch contrastive learning. However, restricting optimization to local in-batch samples creates a fundamental limitation we term the Gradient Locality Bottleneck (GLB), which prevents the resolution of acoustic ambiguities and hinders the learning of rare long-tail concepts. While external knowledge injection can break this bottleneck, it often triggers a problem called Representation-Drift Mismatch (RDM), where a static knowledge base becomes misaligned with evolving encoders, degrading guidance into noise. To address these intertwined challenges, we propose the Adaptive Self-improving Knowledge (ASK) framework. ASK breaks the GLB via multi-grained knowledge injection and mitigates RDM through a dynamic refinement strategy that synchronizes the knowledge base with the model. Additionally, an adaptive reliability weighting scheme is employed to filter retrieval noise based on cross-modal consistency. Extensive experiments across multiple benchmarks demonstrate that ASK consistently achieves new state-of-the-art performance across various backbones.","short_abstract":"The dominant paradigm for Audio-Text Retrieval (ATR) relies on dual-encoder architectures optimized via mini-batch contrastive learning. However, restricting optimization to local in-batch samples creates a fundamental limitation we term the Gradient Locality Bottleneck (GLB), which prevents the resolution of acoustic...","url_abs":"https://arxiv.org/abs/2512.19703","url_pdf":"https://arxiv.org/pdf/2512.19703v2","authors":"[\"Siyuan Fu\",\"Xuchen Guo\",\"Mingjun Liu\",\"Hongxiang Li\",\"Boyin Tan\",\"Gongxi Zhu\",\"Xianwei Zhuang\",\"Jinghan Ru\",\"Yuxin Xie\",\"Yuguo Yin\"]","published":"2025-12-11T14:48:30Z","proceeding":"eess.AS","tasks":"[\"eess.AS\",\"cs.IR\",\"cs.LG\",\"cs.MM\",\"cs.SD\"]","methods":"[]","has_code":false}
