{"ID":2879672,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.15220","arxiv_id":"2508.15220","title":"Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models","abstract":"Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: accuracy and explainability. Exploring the trade-off between these objectives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation synthesis have been developed, they typically lack formal guarantees on the Pareto-optimality of the results. Methods that do provide such guarantees, on the other hand, often face severe scalability limitations when exploring the Pareto-optimal space. To address this, we develop a framework based on local optimality guarantees that enables more scalable synthesis of interpretations. Specifically, we consider the problem of synthesizing a set of Pareto-optimal interpretations with local optimality guarantees, within the immediate neighborhood of each solution. Our approach begins with a multi-objective learning or search technique, such as Multi-Objective Monte Carlo Tree Search, to generate a best-effort set of Pareto-optimal candidates with respect to accuracy and explainability. We then verify local optimality for each candidate as a Boolean satisfiability problem, which we solve using a SAT solver. We demonstrate the efficacy of our approach on a set of benchmarks, comparing it against previous methods for exploring the Pareto-optimal front of interpretations. In particular, we show that our approach yields interpretations that closely match those synthesized by methods offering global guarantees.","short_abstract":"Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: accuracy and explainability. Exploring the trade-off between these objectives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation s...","url_abs":"https://arxiv.org/abs/2508.15220","url_pdf":"https://arxiv.org/pdf/2508.15220v1","authors":"[\"Aniruddha Joshi\",\"Supratik Chakraborty\",\"S Akshay\",\"Shetal Shah\",\"Hazem Torfah\",\"Sanjit Seshia\"]","published":"2025-08-21T04:11:20Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.LO\"]","methods":"[]","has_code":false}