{"ID":2826634,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.18915","arxiv_id":"2512.18915","title":"QoS-Aware Load Balancing in the Computing Continuum via Multi-Player Bandits","abstract":"As computation shifts from the cloud to the edge to reduce processing latency and network traffic, the resulting Computing Continuum (CC) creates a dynamic environment where meeting strict Quality of Service (QoS) requirements and avoiding service instance overload becomes challenging. Existing methods often prioritize global metrics and overlook per-client QoS, which is crucial for latency-sensitive and reliability-critical applications. We propose QEdgeProxy, a decentralized QoS-aware load balancer that acts as a proxy between IoT devices and service instances in the CC. We formulate the load balancing problem as a Multi-Player Multi-Armed Bandit (MP-MAB) with heterogeneous rewards: Each load balancer autonomously selects service instances to maximize the probability of meeting its clients' QoS requirements by using Kernel Density Estimation (KDE) to estimate QoS success probabilities. Our load-balancing algorithm also incorporates an adaptive exploration mechanism to recover rapidly from performance shifts and non-stationary conditions. We present a Kubernetes-native QEdgeProxy implementation and evaluate it on an emulated CC testbed deployed on a K3s cluster with realistic network conditions and a latency-sensitive edge-AI workload. Results show that QEdgeProxy significantly outperforms proximity-based and reinforcement-learning baselines in per-client QoS satisfaction, while adapting effectively to load surges and changes in instance availability.","short_abstract":"As computation shifts from the cloud to the edge to reduce processing latency and network traffic, the resulting Computing Continuum (CC) creates a dynamic environment where meeting strict Quality of Service (QoS) requirements and avoiding service instance overload becomes challenging. Existing methods often prioritize...","url_abs":"https://arxiv.org/abs/2512.18915","url_pdf":"https://arxiv.org/pdf/2512.18915v2","authors":"[\"Ivan Čilić\",\"Ivana Podnar Žarko\",\"Pantelis Frangoudis\",\"Schahram Dustdar\"]","published":"2025-12-21T23:18:07Z","proceeding":"cs.NI","tasks":"[\"cs.NI\",\"cs.DC\"]","methods":"[\"LoRA\"]","has_code":false}
