{"ID":2869281,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.14848","arxiv_id":"2509.14848","title":"Multi-Fidelity Hybrid Reinforcement Learning via Information Gain Maximization","abstract":"Optimizing a reinforcement learning (RL) policy typically requires extensive interactions with a high-fidelity simulator of the environment, which are often costly or impractical. Offline RL addresses this problem by allowing training from pre-collected data, but its effectiveness is strongly constrained by the size and quality of the dataset. Hybrid offline-online RL leverages both offline data and interactions with a single simulator of the environment. In many real-world scenarios, however, multiple simulators with varying levels of fidelity and computational cost are available. In this work, we study multi-fidelity hybrid RL for policy optimization under a fixed cost budget. We introduce multi-fidelity hybrid RL via information gain maximization (MF-HRL-IGM), a hybrid offline-online RL algorithm that implements fidelity selection based on information gain maximization through a bootstrapping approach. Theoretical analysis establishes the no-regret property of MF-HRL-IGM, while empirical evaluations demonstrate its superior performance compared to existing benchmarks.","short_abstract":"Optimizing a reinforcement learning (RL) policy typically requires extensive interactions with a high-fidelity simulator of the environment, which are often costly or impractical. \nOffline RL addresses this problem by allowing training from pre-collected data, but its effectiveness is strongly constrained by the size an...","url_abs":"https://arxiv.org/abs/2509.14848","url_pdf":"https://arxiv.org/pdf/2509.14848v1","authors":"[\"Houssem Sifaou\",\"Osvaldo Simeone\"]","published":"2025-09-18T11:12:22Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"eess.SP\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}