{"ID":2868971,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.16326","arxiv_id":"2509.16326","title":"HARE: an entity and relation centric evaluation framework for histopathology reports","abstract":"Medical domain automated text generation is an active area of research and development; however, evaluating the clinical quality of generated reports remains a challenge, especially in instances where domain-specific metrics are lacking, e.g. histopathology. We propose HARE (Histopathology Automated Report Evaluation), a novel entity and relation centric framework, composed of a benchmark dataset, a named entity recognition (NER) model, a relation extraction (RE) model, and a novel metric, which prioritizes clinically relevant content by aligning critical histopathology entities and relations between reference and generated reports. To develop the HARE benchmark, we annotated 813 de-identified clinical diagnostic histopathology reports and 652 histopathology reports from The Cancer Genome Atlas (TCGA) with domain-specific entities and relations. We fine-tuned GatorTronS, a domain-adapted language model to develop HARE-NER and HARE-RE which achieved the highest overall F1-score (0.915) among the tested models. The proposed HARE metric outperformed traditional metrics including ROUGE and Meteor, as well as radiology metrics such as RadGraph-XL, with the highest correlation and the best regression to expert evaluations (higher than the second best method, GREEN, a large language model based radiology report evaluator, by Pearson $r = 0.168$, Spearman $ρ= 0.161$, Kendall $τ= 0.123$, $R^2 = 0.176$, $RMSE = 0.018$).\nWe release HARE, datasets, and the models at https://github.com/knowlab/HARE to foster advancements in histopathology report generation, providing a robust framework for improving the quality of reports.","short_abstract":"Medical domain automated text generation is an active area of research and development; however, evaluating the clinical quality of generated reports remains a challenge, especially in instances where domain-specific metrics are lacking, e.g. histopathology. We propose HARE (Histopathology Automated Report Evaluation),...","url_abs":"https://arxiv.org/abs/2509.16326","url_pdf":"https://arxiv.org/pdf/2509.16326v1","authors":"[\"Yunsoo Kim\",\"Michal W. S. Ong\",\"Alex Shavick\",\"Honghan Wu\",\"Adam P. Levine\"]","published":"2025-09-19T18:12:19Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.CV\"]","methods":"[\"Language Model\"]","has_code":false,"code_links":[{"ID":609640,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2868971,"paper_url":"https://arxiv.org/abs/2509.16326","paper_title":"HARE: an entity and relation centric evaluation framework for histopathology reports","repo_url":"https://github.com/knowlab/HARE","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
