{"ID":2887647,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.01401","arxiv_id":"2508.01401","title":"MedSynth: Realistic, Synthetic Medical Dialogue-Note Pairs","abstract":"Physicians spend significant time documenting clinical encounters, a burden that contributes to professional burnout. To address this, robust automation tools for medical documentation are crucial. We introduce MedSynth -- a novel dataset of synthetic medical dialogues and notes designed to advance the Dialogue-to-Note (Dial-2-Note) and Note-to-Dialogue (Note-2-Dial) tasks. Informed by an extensive analysis of disease distributions, this dataset includes over 10,000 dialogue-note pairs covering over 2000 ICD-10 codes. We demonstrate that our dataset markedly enhances the performance of models in generating medical notes from dialogues, and dialogues from medical notes. The dataset provides a valuable resource in a field where open-access, privacy-compliant, and diverse training data are scarce. Code is available at https://github.com/ahmadrezarm/MedSynth/tree/main and the dataset is available at https://huggingface.co/datasets/Ahmad0067/MedSynth.","short_abstract":"Physicians spend significant time documenting clinical encounters, a burden that contributes to professional burnout. To address this, robust automation tools for medical documentation are crucial. We introduce MedSynth -- a novel dataset of synthetic medical dialogues and notes designed to advance the Dialogue-to-Note...","url_abs":"https://arxiv.org/abs/2508.01401","url_pdf":"https://arxiv.org/pdf/2508.01401v1","authors":"[\"Ahmad Rezaie Mianroodi\",\"Amirali Rezaie\",\"Niko Grisel Todorov\",\"Cyril Rakovski\",\"Frank Rudzicz\"]","published":"2025-08-02T15:18:19Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[]","has_code":false,"code_links":[{"ID":611453,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2887647,"paper_url":"https://arxiv.org/abs/2508.01401","paper_title":"MedSynth: Realistic, Synthetic Medical Dialogue-Note Pairs","repo_url":"https://github.com/ahmadrezarm/MedSynth","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
