SearchGym: A Modular Infrastructure for Cross-Platform Benchmarking and Hybrid Search Orchestration
Abstract
The rapid growth of Retrieval-Augmented Generation (RAG) has created a proliferation of toolkits, yet a fundamental gap remains between experimental prototypes and robust, production-ready systems. We present SearchGym, a modular infrastructure designed for cross-platform benchmarking and hybrid search orchestration. Unlike existing model-centric frameworks, SearchGym decouples data representation, embedding strategies, and retrieval logic into stateful abstractions: Dataset, VectorSet, and App. This separation enables a Compositional Config Algebra, allowing designers to synthesize entire systems from hierarchical configurations while ensuring perfect reproducibility. Moreover, we analyze the "Top-$k$ Cognizance" in hybrid retrieval pipelines, demonstrating that the optimal sequence of semantic ranking and structured filtering is highly dependent on filter strength. Evaluated on the LitSearch expert-annotated benchmark, SearchGym achieves a 70% Top-100 retrieval rate. SearchGym reveals a design tension between generalizability and optimizability, presenting the potential where engineering optimization may serve as a tool for uncovering the causal mechanisms inherent in information retrieval across heterogeneous domains. An open-source implementation of SearchGym is available at: https://github.com/JeromeTH/search-gym