{"ID":2847203,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.13728","arxiv_id":"2511.13728","title":"Gaia: Hybrid Hardware Acceleration for Serverless AI in the 3D Compute Continuum","abstract":"Serverless computing offers elastic scaling and pay-per-use execution, making it well-suited for AI workloads. As these workloads run in heterogeneous environments such as the Edge-Cloud-Space 3D Continuum, they often require intensive parallel computation, which GPUs can perform far more efficiently than CPUs. However, current platforms struggle to manage hardware acceleration effectively, as static user-device assignments fail to ensure SLO compliance under varying loads or placements, and one-time dynamic selections often lead to suboptimal or cost-inefficient configurations. To address these issues, we present Gaia, a GPU-as-a-service model and architecture that makes hardware acceleration a platform concern. Gaia combines (i) a lightweight Execution Mode Identifier that inspects function code at deploy time to emit one of four execution modes, and a Dynamic Function Runtime that continuously reevaluates user-defined SLOs to promote or demote between CPU- and GPU backends. Our evaluation shows that it seamlessly selects the best hardware acceleration for the workload, reducing end-to-end latency by up to 95%. These results indicate that Gaia enables SLO-aware, cost-efficient acceleration for serverless AI across heterogeneous environments.","short_abstract":"Serverless computing offers elastic scaling and pay-per-use execution, making it well-suited for AI workloads. As these workloads run in heterogeneous environments such as the Edge-Cloud-Space 3D Continuum, they often require intensive parallel computation, which GPUs can perform far more efficiently than CPUs. However...","url_abs":"https://arxiv.org/abs/2511.13728","url_pdf":"https://arxiv.org/pdf/2511.13728v1","authors":"[\"Maximilian Reisecker\",\"Cynthia Marcelino\",\"Thomas Pusztai\",\"Stefan Nastic\"]","published":"2025-11-01T07:44:50Z","proceeding":"cs.DC","tasks":"[\"cs.DC\"]","methods":"[]","has_code":false}
