{"ID":2866363,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.19757","arxiv_id":"2509.19757","title":"ARCADE: A Real-Time Data System for Hybrid and Continuous Query Processing across Diverse Data Modalities","abstract":"The explosive growth of multimodal data - spanning text, image, video, spatial, and relational modalities, coupled with the need for real-time semantic search and retrieval over these data - has outpaced the capabilities of existing multimodal and real-time database systems, which either lack efficient ingestion and continuous query capability, or fall short in supporting expressive hybrid analytics. We introduce ARCADE, a real-time data system that efficiently supports high-throughput ingestion and expressive hybrid and continuous query processing across diverse data types. ARCADE introduces a unified disk-based secondary index on LSM-based storage for vector, spatial, and text data modalities, a comprehensive cost-based query optimizer for hybrid queries, and an incremental materialized view framework for efficient continuous queries. 
Built on open-source RocksDB storage and MySQL query engine, ARCADE outperforms leading multimodal data systems by up to 7.4x on read-heavy and 1.4x on write-heavy workloads.","short_abstract":"The explosive growth of multimodal data - spanning text, image, video, spatial, and relational modalities, coupled with the need for real-time semantic search and retrieval over these data - has outpaced the capabilities of existing multimodal and real-time database systems, which either lack efficient ingestion and co...","url_abs":"https://arxiv.org/abs/2509.19757","url_pdf":"https://arxiv.org/pdf/2509.19757v1","authors":"[\"Jingyi Yang\",\"Songsong Mo\",\"Jiachen Shi\",\"Zihao Yu\",\"Kunhao Shi\",\"Xuchen Ding\",\"Gao Cong\"]","published":"2025-09-24T04:26:25Z","proceeding":"cs.DB","tasks":"[\"cs.DB\",\"cs.AI\"]","methods":"[]","has_code":false}
