{"ID":2923533,"CreatedAt":"2026-06-02T04:05:25.881865328Z","UpdatedAt":"2026-06-04T13:12:39.622923895Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.02487","arxiv_id":"2606.02487","title":"Towards Multidisciplinary Summarization of Hospital Stays: Efficient Sentence-Level Clinical Provenance Categorization","abstract":"Effective \"all-team\" summarization in high-complexity settings like the Neonatal Intensive Care Unit (NICU) requires aggregating insights from diverse disciplines (physicians, nurses, therapists) spread across hundreds of clinical free-text notes. Simply pooling heterogeneous text often leads to incoherent outputs. Structured summarization therefore first requires accurate categorization of sentence-level provenance across multi-source notes. This pilot study introduces a clinical provenance categorization pipeline using supervised fine-tuning (SFT) of large language models (LLMs). We adapted two Llama-3 models (8B and 70B) to MedSecId, a corpus of 2,002 MIMIC-III (Adult ICU) notes annotated with clinical provenance headers, achieving in-domain Macro F1 scores above 92% for both models. To evaluate cross-domain generalization, we assessed model capacity (8B vs. 70B) and quantization on a gold-standard dataset of 227 sentence-level spans derived from three multi-disciplinary NICU summaries. Experimental results demonstrate a scale-dependent transfer effect: while SFT produced only marginal changes for the 8B model, it substantially improved the 70B model, increasing Macro F1 by 7%. Notably, the quantized fine-tuned 70B model outperformed its full-precision baseline while substantially reducing computational requirements. These findings suggest that sufficient model capacity is critical for preserving semantic flexibility during cross-domain clinical transfer and that efficient quantized adaptation can enable structured provenance modeling for downstream summarization.","short_abstract":"Effective \"all-team\" summarization in high-complexity settings like the Neonatal Intensive Care Unit (NICU) requires aggregating insights from diverse disciplines (physicians, nurses, therapists) spread across hundreds of clinical free-text notes. Simply pooling heterogeneous text often leads to incoherent outputs. Str...","url_abs":"https://arxiv.org/abs/2606.02487","url_pdf":"https://arxiv.org/pdf/2606.02487v1","authors":"[\"Baris Karacan\",\"Vaibhav Bhargava\",\"Barbara Di Eugenio\",\"Natalie Parde\",\"Mary Khetani\",\"Yu-Shan Tseng\",\"Vanessa Barbosa\",\"Julie Vignato\",\"Lindsey Knake\",\"Rajashree Dahal\",\"Emily Spellman\",\"Danielle Hitzel\",\"Janine Petitgout\",\"Kristi Haughey\",\"Amanda Karstens\",\"Brianna Clarahan\",\"Rachel Dawson\",\"Lauren Boyd\",\"Mackenzie Weis\",\"Angie Tipton\",\"Jaewon Bae\",\"Catherine K. Craven\",\"Karen Dunn Lopez\",\"Andrew D. Boyd\"]","published":"2026-06-01T16:57:51Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
