| Main Authors: | Koo, Kyoseung; Kim, Bogyeong; Moon, Bongki |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Databases |
| Online Access: | https://arxiv.org/abs/2508.02508 |
| _version_ | 1866912520862695424 |
|---|---|
| author | Koo, Kyoseung; Kim, Bogyeong; Moon, Bongki |
| author_facet | Koo, Kyoseung; Kim, Bogyeong; Moon, Bongki |
| contents | Modern data analytic workloads increasingly require handling multiple data models simultaneously. Two primary approaches meet this need: polyglot persistence and multi-model database systems. Polyglot persistence employs a coordinator program to manage several independent database systems but suffers from high communication costs due to its physically disaggregated architecture. Meanwhile, existing multi-model database systems rely on a single storage engine optimized for a specific data model, resulting in inefficient processing across diverse data models. To address these limitations, we present M2, a multi-model analytic system with integrated storage engines. M2 treats all data models as first-class entities, composing query plans that incorporate operations across models. To effectively combine data from different models, the system introduces a specialized inter-model join algorithm called multi-stage hash join. Our evaluation demonstrates that M2 achieves up to a 188x speedup over existing approaches on multi-model analytics, confirming the effectiveness of our proposed techniques. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2508_02508 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | M2: An Analytic System with Specialized Storage Engines for Multi-Model Workloads |
| topic | Databases |
| url | https://arxiv.org/abs/2508.02508 |