{"ID":3083786,"CreatedAt":"2026-06-05T06:46:15.197025399Z","UpdatedAt":"2026-06-07T06:37:52.911886358Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.06056","arxiv_id":"2606.06056","title":"Metamorphic Testing with the Rashomon Set: Explanation Faithfulness in Machine Learning","abstract":"Multiple machine learning models can achieve near-equivalent predictive performance on the same task, yet provide divergent feature-based explanations. This is called the Rashomon effect of (explainable) machine learning, and it raises the question of which explanations, if any, are trustworthy. We propose a framework based on metamorphic testing that assesses explanation faithfulness without requiring ground-truth labels by exploring attributed feature importance from post-hoc explanation methods. Five metamorphic relations formalize expected consistency properties between model behavior and feature attributions. We apply this general framework to two tabular regression datasets and two post-hoc explainers (SHAP and LIME) to demonstrate the approach. The framework offers a practical, model-agnostic tool for selecting accurate models with reliable and trustworthy explanations.","short_abstract":"Multiple machine learning models can achieve near-equivalent predictive performance on the same task, yet provide divergent feature-based explanations. This is called the Rashomon effect of (explainable) machine learning, and it raises the question of which explanations, if any, are trustworthy. We propose a framework...","url_abs":"https://arxiv.org/abs/2606.06056","url_pdf":"https://arxiv.org/pdf/2606.06056v1","authors":"[\"Helge Spieker\",\"Jørn Eirik Betten\",\"Arnaud Gotlieb\"]","published":"2026-06-04T11:57:26Z","proceeding":"cs.SE","tasks":"[\"cs.SE\",\"cs.AI\",\"cs.LG\"]","methods":"[]","has_code":false}