{"ID":3004912,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-05T10:21:46.366257699Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03461","arxiv_id":"2606.03461","title":"What Makes Interaction Trajectories Effective for Training Terminal Agents?","abstract":"Stronger code agents are commonly assumed to be superior teachers for post-training, yet this assumption remains poorly disentangled from task difficulty, harness design, and student capacity. We investigate this pedagogical link using Terminal-Lego, a scalable pipeline that transforms multi-domain real-world issues into environment-verified agentic tasks. Surprisingly, standalone performance does not dictate teaching efficacy: while Claude Opus 4.6 achieves higher scores on Terminal-Bench 2.0, students fine-tuned on trajectories from DeepSeek-V3.2, a lower-scoring agent, exhibit significantly stronger generalization. We attribute this \"pedagogical paradox\" to Environment-Grounded Supervision (EGS): trajectories that explicitly expose inspect-act-verify behaviors through harness-visible interactions allow students to internalize robust problem-solving routines rather than fragile action sequences. Scaling analysis reveals exceptional data efficiency: with only 15.3k Terminal-Lego trajectories, for example, Qwen3-32B achieves a 24.3% score on Terminal-Bench 2.0, rivaling previous SOTA performance established with over 30x the data volume. Our results suggest that the frontier of agent post-training lies beyond mere outcome-matching, shifting the focus toward \"Harness Engineering\", where the systematic design of environment-grounded interaction structures serves as the primary catalyst for reproducible and generalizable agentic intelligence.","short_abstract":"Stronger code agents are commonly assumed to be superior teachers for post-training, yet this assumption remains poorly disentangled from task difficulty, harness design, and student capacity. \nWe investigate this pedagogical link using Terminal-Lego, a scalable pipeline that transforms multi-domain real-world issues in...","url_abs":"https://arxiv.org/abs/2606.03461","url_pdf":"https://arxiv.org/pdf/2606.03461v1","authors":"[\"Sidi Yang\",\"Chaofan Tao\",\"Jierun Chen\",\"Tiezheng Yu\",\"Ruoyu Wang\",\"Yuxin Jiang\",\"Yiming Du\",\"Wendong Xu\",\"Jing Xiong\",\"Taiqiang Wu\",\"Lifeng Shang\",\"Xiaohui Li\",\"Ngai Wong\",\"Haoli Bai\"]","published":"2026-06-02T10:37:47Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[]","has_code":false}
