{"ID":2884906,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.06692","arxiv_id":"2508.06692","title":"Stabilizing Federated Learning under Extreme Heterogeneity with HeteRo-Select","abstract":"Federated Learning (FL) is a machine learning technique that often suffers from training instability due to the diverse nature of client data. Although utility-based client selection methods like Oort are used to converge by prioritizing high-loss clients, they frequently experience significant drops in accuracy during later stages of training. We propose a theoretical HeteRo-Select framework designed to maintain high performance and ensure long-term training stability. We provide a theoretical analysis showing that when client data is very different (high heterogeneity), choosing a smart subset of client participation can reduce communication more effectively compared to full participation. Our HeteRo-Select method uses a clear, step-by-step scoring system that considers client usefulness, fairness, update speed, and data variety. It also shows convergence guarantees under strong regularization. Our experimental results on the CIFAR-10 dataset under significant label skew ($α=0.1$) support the theoretical findings. The HeteRo-Select method performs better than existing approaches in terms of peak accuracy, final accuracy, and training stability. Specifically, HeteRo-Select achieves a peak accuracy of $74.75\\%$, a final accuracy of $72.76\\%$, and a minimal stability drop of $1.99\\%$. In contrast, Oort records a lower peak accuracy of $73.98\\%$, a final accuracy of $71.25\\%$, and a larger stability drop of $2.73\\%$. The theoretical foundations and empirical performance in our study make HeteRo-Select a reliable solution for real-world heterogeneous FL problems.","short_abstract":"Federated Learning (FL) is a machine learning technique that often suffers from training instability due to the diverse nature of client data. Although utility-based client selection methods like Oort are used to converge by prioritizing high-loss clients, they frequently experience significant drops in accuracy during...","url_abs":"https://arxiv.org/abs/2508.06692","url_pdf":"https://arxiv.org/pdf/2508.06692v1","authors":"[\"Md. Akmol Masud\",\"Md Abrar Jahin\",\"Mahmud Hasan\"]","published":"2025-08-08T20:33:34Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}