{"ID":2888392,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.23568","arxiv_id":"2507.23568","title":"Optimised Feature Subset Selection via Simulated Annealing","abstract":"We introduce SA-FDR, a novel algorithm for $\\ell_0$-norm feature selection that considers this task as a combinatorial optimisation problem and solves it by using simulated annealing to perform a global search over the space of feature subsets. The optimisation is guided by the Fisher discriminant ratio, which we use as a computationally efficient proxy for model quality in classification tasks. Our experiments, conducted on datasets with up to hundreds of thousands of samples and hundreds of features, demonstrate that SA-FDR consistently selects more compact feature subsets while achieving a high predictive accuracy. This ability to recover informative yet minimal sets of features stems from its capacity to capture inter-feature dependencies often missed by greedy optimisation approaches. As a result, SA-FDR provides a flexible and effective solution for designing interpretable models in high-dimensional settings, particularly when model sparsity, interpretability, and performance are crucial.","short_abstract":"We introduce SA-FDR, a novel algorithm for $\\ell_0$-norm feature selection that considers this task as a combinatorial optimisation problem and solves it by using simulated annealing to perform a global search over the space of feature subsets. The optimisation is guided by the Fisher discriminant ratio, which we use a...","url_abs":"https://arxiv.org/abs/2507.23568","url_pdf":"https://arxiv.org/pdf/2507.23568v1","authors":"[\"Fernando Martínez-García\",\"Álvaro Rubio-García\",\"Samuel Fernández-Lorenzo\",\"Juan José García-Ripoll\",\"Diego Porras\"]","published":"2025-07-31T13:57:38Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cond-mat.stat-mech\",\"stat.ML\"]","methods":"[]","has_code":false}
