{"ID":2823254,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.00655","arxiv_id":"2601.00655","title":"Interpretability-Guided Bi-objective Optimization: Aligning Accuracy and Explainability","abstract":"This paper introduces Interpretability-Guided Bi-objective Optimization (IGBO), a framework that trains interpretable models by incorporating structured domain knowledge via a bi-objective formulation. IGBO encodes feature importance hierarchies as a Directed Acyclic Graph (DAG) via Central Limit Theorem-based construction and uses Temporal Integrated Gradients (TIG) to measure feature importance. The framework employs a novel Relative Importance Score Hk(X, θ) that quantifies the normalized cumulative attribution of each feature over time. We propose a geometric projection mapping P for combining task and interpretability gradients, and prove convergence to Pareto-stationary points. To address the Out-of-Distribution problem in TIG computation, we outline an Optimal Path Oracle architecture, which we leave for future work. Central Limit Theorem-based construction of the interpretability DAG provides statistical guarantees on acyclicity and transitivity, with an unconditional guarantee for the median threshold and conditional guarantees for higher confidence levels.","short_abstract":"This paper introduces Interpretability-Guided Bi-objective Optimization (IGBO), a framework that trains interpretable models by incorporating structured domain knowledge via a bi-objective formulation. IGBO encodes feature importance hierarchies as a Directed Acyclic Graph (DAG) via Central Limit Theorem-based construc...","url_abs":"https://arxiv.org/abs/2601.00655","url_pdf":"https://arxiv.org/pdf/2601.00655v3","authors":"[\"Kasra Fouladi\",\"Hamta Rahmani\"]","published":"2026-01-02T11:32:00Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[]","has_code":false}