{"ID":2882547,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.11085","arxiv_id":"2508.11085","title":"A learning-driven automatic planning framework for proton PBS treatments of H\u0026N cancers","abstract":"Proton pencil beam scanning (PBS) treatment planning for head \u0026 neck (H\u0026N) cancers involves numerous conflicting objectives, requiring iterative objective parameter adjustments to balance multiple clinical goals. We propose a learning-driven inverse optimizer and integrate it into a proximal policy optimization (PPO)-based planning framework to automatically generate high-quality plans for patients with diverse treatment requirements. The inverse optimizer is a learning-to-optimize (L2O) method that predicts update steps by learning from task-specific data distributions. For the first time, long-context processing techniques developed for large language models (LLMs) are utilized to address the scalability limitations of existing L2O methods, enabling simultaneous optimization over a substantially large set of variables. The PPO framework functions as an outer-loop virtual planner, autonomously adjusting objective parameters through a policy network, and the inner-loop L2O inverse optimizer computes machine-deliverable spot monitor unit (MU) values based on the PPO-refined objectives. Moreover, a Swin UnetR dose predictor is trained with prescription- and beam-specific information to estimate the initial objective parameters. In our experiments, total 97 patients with bilateral or ipsilateral H\u0026N cancers are collected for training and testing. Compared with the second-order gradient-based methods, our L2O optimizer improves the effectiveness and efficiency of the time-consuming inverse optimization by 22.97% and 36.41%, respectively, and in conjunction with the PPO-based virtual planner, plans are generated within clinically acceptable times, i.e. 2.55 hours in average, and shows improved or comparable organs-at-risk sparing with superior target coverage compared with human-generated plans.","short_abstract":"Proton pencil beam scanning (PBS) treatment planning for head \u0026 neck (H\u0026N) cancers involves numerous conflicting objectives, requiring iterative objective parameter adjustments to balance multiple clinical goals. We propose a learning-driven inverse optimizer and integrate it into a proximal policy optimization (PPO)-b...","url_abs":"https://arxiv.org/abs/2508.11085","url_pdf":"https://arxiv.org/pdf/2508.11085v2","authors":"[\"Qingqing Wang\",\"Liqiang Xiao\",\"Chang Chang\"]","published":"2025-08-14T21:50:31Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.LG\"]","methods":"[\"Large Language Model\",\"Language Model\",\"Generative Adversarial Network\"]","has_code":false}