{"ID":2879622,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.16839","arxiv_id":"2508.16839","title":"One VLM, Two Roles: Stage-Wise Routing and Specialty-Level Deployment for Clinical Workflows","abstract":"Clinical ML workflows are often fragmented and inefficient: triage, task selection, and model deployment are handled by a patchwork of task-specific networks. These pipelines are rarely aligned with data-science practice, reducing efficiency and increasing operational cost. They also lack data-driven model identification (from imaging/tabular inputs) and standardized delivery of model outputs. We present a framework that employs a single vision-language model (VLM) in two complementary, modular roles. First (Solution 1): the VLM acts as an aware model-card matcher that routes an incoming image to the appropriate specialist model via a three-stage workflow (modality -\u003e primary abnormality -\u003e model-card ID). Reliability is improved by (i) stage-wise prompts enabling early termination via \"None\"/\"Other\" and (ii) a calibrated top-2 answer selector with a stage-wise cutoff. This raises routing accuracy by +9 and +11 percentage points on the training and held-out splits, respectively, compared with a baseline router, and improves held-out calibration (lower Expected Calibration Error, ECE). Second (Solution 2): we fine-tune the same VLM on specialty-specific datasets so that one model per specialty covers multiple downstream tasks, simplifying deployment while maintaining performance. Across gastroenterology, hematology, ophthalmology, pathology, and radiology, this single-model deployment matches or approaches specialized baselines. Together, these solutions reduce data-science effort through more accurate selection, simplify monitoring and maintenance by consolidating task-specific models, and increase transparency via per-stage justifications and calibrated thresholds. Each solution stands alone, and in combination they offer a practical, modular path from triage to deployment.","short_abstract":"Clinical ML workflows are often fragmented and inefficient: triage, task selection, and model deployment are handled by a patchwork of task-specific networks. These pipelines are rarely aligned with data-science practice, reducing efficiency and increasing operational cost. They also lack data-driven model identificati...","url_abs":"https://arxiv.org/abs/2508.16839","url_pdf":"https://arxiv.org/pdf/2508.16839v4","authors":"[\"Shayan Vassef\",\"Soorya Ram Shimegekar\",\"Abhay Goyal\",\"Koustuv Saha\",\"Pi Zonooz\",\"Navin Kumar\"]","published":"2025-08-22T23:34:37Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Language Model\"]","has_code":false}