{"ID":2832372,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.05367","arxiv_id":"2512.05367","title":"Enhancing Dimensionality Prediction in Hybrid Metal Halides via Feature Engineering and Class-Imbalance Mitigation","abstract":"We present a machine learning framework for predicting the structural dimensionality of hybrid metal halides (HMHs), including organic-inorganic perovskites, using a combination of chemically-informed feature engineering and advanced class-imbalance handling techniques. The dataset, consisting of 494 HMH structures, is highly imbalanced across dimensionality classes (0D, 1D, 2D, 3D), posing significant challenges to predictive modeling. This dataset was later augmented to 1336 via the Synthetic Minority Oversampling Technique (SMOTE) to mitigate the effects of the class imbalance. We developed interaction-based descriptors and integrated them into a multi-stage workflow that combines feature selection, model stacking, and performance optimization to improve dimensionality prediction accuracy. Our approach significantly improves F1-scores for underrepresented classes, achieving robust cross-validation performance across all dimensionalities.","short_abstract":"We present a machine learning framework for predicting the structural dimensionality of hybrid metal halides (HMHs), including organic-inorganic perovskites, using a combination of chemically-informed feature engineering and advanced class-imbalance handling techniques. The dataset, consisting of 494 HMH structures, is...","url_abs":"https://arxiv.org/abs/2512.05367","url_pdf":"https://arxiv.org/pdf/2512.05367v1","authors":"[\"Mariia Karabin\",\"Isaac Armstrong\",\"Leo Beck\",\"Paulina Apanel\",\"Markus Eisenbach\",\"David B. Mitzi\",\"Hanna Terletska\",\"Hendrik Heinz\"]","published":"2025-12-05T02:05:46Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Generative Adversarial Network\"]","has_code":false}
