{"ID":2823099,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.01209","arxiv_id":"2601.01209","title":"OrchestrRL: Dynamic Compute and Network Orchestration for Disaggregated RL","abstract":"Disaggregating the generation and training stages in RL is widely adopted to scale LLM post-training. There are two critical challenges here. First, the generation stage often becomes a bottleneck due to dynamic workload shifts and severe execution imbalances. Second, the decoupled stages result in diverse and dynamic network traffic patterns that strain the conventional static fabric. We build OrchestrRL to orchestrate dynamically both compute and network in disaggregated RL. OrchestrRL employs an adaptive compute scheduler that adjusts parallelism configuration to match changing workload characteristics within and across generation steps. OrchestrRL adopts a reconfigurable optical-electrical fabric called RFabric: It leverages optical circuit switches to reconfigure the aggregation and core layers of the topology on demand, tailoring bandwidth resources to the unique communication patterns across various phases of training, generation, and weight synchronization. Evaluated on a 64-H800 GPU testbed, OrchestrRL demonstrates up to a 1.42x throughput improvement over static baselines. Using a high-fidelity simulator, we also show that RFabric achieves superior performance-cost efficiency at scale over static Fat-Tree networks.","short_abstract":"Disaggregating the generation and training stages in RL is widely adopted to scale LLM post-training. There are two critical challenges here. First, the generation stage often becomes a bottleneck due to dynamic workload shifts and severe execution imbalances. Second, the decoupled stages result in diverse and dynamic...","url_abs":"https://arxiv.org/abs/2601.01209","url_pdf":"https://arxiv.org/pdf/2601.01209v2","authors":"[\"Xin Tan\",\"Yicheng Feng\",\"Yu Zhou\",\"Yimin Jiang\",\"Yibo Zhu\",\"Hong Xu\"]","published":"2026-01-03T15:27:24Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.NI\"]","methods":"[\"Large Language Model\"]","has_code":false}