{"ID":2878827,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.17212","arxiv_id":"2508.17212","title":"Reinforcement Learning enhanced Online Adaptive Clinical Decision Support via Digital Twin powered Policy and Treatment Effect optimized Reward","abstract":"Clinical decision support must adapt online under safety constraints. We present an online adaptive tool where reinforcement learning provides the policy, a patient digital twin provides the environment, and treatment effect defines the reward. The system initializes a batch-constrained policy from retrospective data and then runs a streaming loop that selects actions, checks safety, and queries experts only when uncertainty is high. Uncertainty comes from a compact ensemble of five Q-networks via the coefficient of variation of action values with a $\\tanh$ compression. The digital twin updates the patient state with a bounded residual rule. The outcome model estimates immediate clinical effect, and the reward is the treatment effect relative to a conservative reference with a fixed z-score normalization from the training split. Online updates operate on recent data with short runs and exponential moving averages. A rule-based safety gate enforces vital ranges and contraindications before any action is applied. Experiments in a synthetic clinical simulator show low latency, stable throughput, a low expert query rate at fixed safety, and improved return against standard value-based baselines. The design turns an offline policy into a continuous, clinician-supervised system with clear controls and fast adaptation.","short_abstract":"Clinical decision support must adapt online under safety constraints. We present an online adaptive tool where reinforcement learning provides the policy, a patient digital twin provides the environment, and treatment effect defines the reward. The system initializes a batch-constrained policy from retrospective data a...","url_abs":"https://arxiv.org/abs/2508.17212","url_pdf":"https://arxiv.org/pdf/2508.17212v1","authors":"[\"Xinyu Qin\",\"Ruiheng Yu\",\"Lu Wang\"]","published":"2025-08-24T04:51:22Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}