{"ID":2899268,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.01823","arxiv_id":"2507.01823","title":"TD-MPC-Opt: Distilling Model-Based Multi-Task Reinforcement Learning Agents","abstract":"We present a novel approach to knowledge transfer in model-based reinforcement learning, addressing the critical challenge of deploying large world models in resource-constrained environments. Our method efficiently distills a high-capacity multi-task agent (317M parameters) into a compact model (1M parameters) on the MT30 benchmark, significantly improving performance across diverse tasks. Our distilled model achieves a state-of-the-art normalized score of 28.45, surpassing the original 1M parameter model score of 18.93. This improvement demonstrates the ability of our distillation technique to capture and consolidate complex multi-task knowledge. We further optimize the distilled model through FP16 post-training quantization, reducing its size by $\\sim$50\\%. Our approach addresses practical deployment limitations and offers insights into knowledge representation in large world models, paving the way for more efficient and accessible multi-task reinforcement learning systems in robotics and other resource-constrained applications. Code available at https://github.com/dmytro-kuzmenko/td-mpc-opt.","short_abstract":"We present a novel approach to knowledge transfer in model-based reinforcement learning, addressing the critical challenge of deploying large world models in resource-constrained environments. Our method efficiently distills a high-capacity multi-task agent (317M parameters) into a compact model (1M parameters) on the...","url_abs":"https://arxiv.org/abs/2507.01823","url_pdf":"https://arxiv.org/pdf/2507.01823v1","authors":"[\"Dmytro Kuzmenko\",\"Nadiya Shvai\"]","published":"2025-07-02T15:38:49Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.RO\"]","methods":"[\"Reinforcement Learning\"]","has_code":false,"code_links":[{"ID":612465,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2899268,"paper_url":"https://arxiv.org/abs/2507.01823","paper_title":"TD-MPC-Opt: Distilling Model-Based Multi-Task Reinforcement Learning Agents","repo_url":"https://github.com/dmytro-kuzmenko/td-mpc-opt","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
