{"ID":2859579,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T23:40:06.587452691Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.06427","arxiv_id":"2510.06427","title":"Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser","abstract":"We introduce UniRST, the first unified RST-style discourse parser capable of handling 18 treebanks in 11 languages without modifying their relation inventories. To overcome inventory incompatibilities, we propose and evaluate two training strategies: Multi-Head, which assigns a separate relation classification layer per inventory, and Masked-Union, which enables shared parameter training through selective label masking. We first benchmark mono-treebank parsing with a simple yet effective augmentation technique for low-resource settings. We then train a unified model and show that (1) the parameter-efficient Masked-Union approach is also the strongest, and (2) UniRST outperforms 16 of 18 mono-treebank baselines, demonstrating the advantages of single-model, multilingual, end-to-end discourse parsing across diverse resources.","short_abstract":"We introduce UniRST, the first unified RST-style discourse parser capable of handling 18 treebanks in 11 languages without modifying their relation inventories. To overcome inventory incompatibilities, we propose and evaluate two training strategies: Multi-Head, which assigns a separate relation classification layer per...","url_abs":"https://arxiv.org/abs/2510.06427v1","url_pdf":"https://arxiv.org/pdf/2510.06427v1","authors":"Elena Chistova","published":"2025-10-07T20:06:55Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[]","has_code":false}
