{"ID":2886987,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.02508","arxiv_id":"2508.02508","title":"M2: An Analytic System with Specialized Storage Engines for Multi-Model Workloads","abstract":"Modern data analytic workloads increasingly require handling multiple data models simultaneously. Two primary approaches meet this need: polyglot persistence and multi-model database systems. Polyglot persistence employs a coordinator program to manage several independent database systems but suffers from high communication costs due to its physically disaggregated architecture. Meanwhile, existing multi-model database systems rely on a single storage engine optimized for a specific data model, resulting in inefficient processing across diverse data models. To address these limitations, we present M2, a multi-model analytic system with integrated storage engines. M2 treats all data models as first-class entities, composing query plans that incorporate operations across models. To effectively combine data from different models, the system introduces a specialized inter-model join algorithm called multi-stage hash join. Our evaluation demonstrates that M2 outperforms existing approaches by up to 188x speedup on multi-model analytics, confirming the effectiveness of our proposed techniques.","short_abstract":"Modern data analytic workloads increasingly require handling multiple data models simultaneously. Two primary approaches meet this need: polyglot persistence and multi-model database systems. \nPolyglot persistence employs a coordinator program to manage several independent database systems but suffers from high communic...","url_abs":"https://arxiv.org/abs/2508.02508","url_pdf":"https://arxiv.org/pdf/2508.02508v2","authors":"[\"Kyoseung Koo\",\"Bogyeong Kim\",\"Bongki Moon\"]","published":"2025-08-04T15:15:29Z","proceeding":"cs.DB","tasks":"[\"cs.DB\"]","methods":"[]","has_code":false}
