{"ID":2836754,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.19949","arxiv_id":"2511.19949","title":"PolarStore: High-Performance Data Compression for Large-Scale Cloud-Native Databases","abstract":"In recent years, resource elasticity and cost optimization have become essential for RDBMSs. While cloud-native RDBMSs provide elastic computing resources via disaggregated computing and storage, storage costs remain a critical user concern. Consequently, data compression emerges as an effective strategy to reduce storage costs. However, existing compression approaches in RDBMSs present a stark trade-off: software-based approaches incur significant performance overheads, while hardware-based alternatives lack the flexibility required for diverse database workloads. In this paper, we present PolarStore, a compressed shared storage system for cloud-native RDBMSs. PolarStore employs a dual-layer compression mechanism that combines in-storage compression in PolarCSD hardware with lightweight compression in software. This design leverages the strengths of both approaches. PolarStore also incorporates database-oriented optimizations to maintain high performance on critical I/O paths. Drawing from large-scale deployment experiences, we also introduce hardware improvements for PolarCSD to ensure host-level stability and propose a compression-aware scheduling scheme to improve cluster-level space efficiency. PolarStore is currently deployed on thousands of storage servers within PolarDB, managing over 100 PB of data. It achieves a compression ratio of 3.55 and reduces storage costs by approximately 60%. Remarkably, these savings are achieved while maintaining performance comparable to uncompressed clusters.","short_abstract":"In recent years, resource elasticity and cost optimization have become essential for RDBMSs. While cloud-native RDBMSs provide elastic computing resources via disaggregated computing and storage, storage costs remain a critical user concern. Consequently, data compression emerges as an effective strategy to reduce stor...","url_abs":"https://arxiv.org/abs/2511.19949","url_pdf":"https://arxiv.org/pdf/2511.19949v1","authors":"[\"Qingda Hu\",\"Xinjun Yang\",\"Feifei Li\",\"Junru Li\",\"Ya Lin\",\"Yuqi Zhou\",\"Yicong Zhu\",\"Junwei Zhang\",\"Rongbiao Xie\",\"Ling Zhou\",\"Bin Wu\",\"Wenchao Zhou\"]","published":"2025-11-25T05:55:48Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.DB\"]","methods":"[]","has_code":false}
