Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.14708 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866914332023980032 |
|---|---|
| author | Shaska, T. Kotsireas, I. |
| author_facet | Shaska, T. Kotsireas, I. |
| contents | Current distributed data fabrics lack a rigorous mathematical foundation, often relying on ad-hoc architectures that struggle with consistency, lineage, and scale. We propose a mathematical framework for data fabrics, unifying heterogeneous data management in distributed systems through a hypergraph-based structure \( \mathcal{F} = (D, M, G, T, P, A) \). Datasets, metadata, transformations, policies, and analytics are modeled over a distributed system \( Σ= (N, C) \), with multi-way relationships encoded in a hypergraph \( G = (V, E) \). A categorical approach, with datasets as objects and transformations as morphisms, supports operations like data integration and federated learning. The hypergraph is embedded into a modular tensor category, capturing relational symmetries via braided monoidal structures, with geometric analogies to Hurwitz spaces enriching the algebraic modeling. We prove the NP-hardness of critical tasks, such as schema matching and dynamic partitioning, and propose spectral methods and symmetry-based alignments for scalable solutions. The framework ensures consistency, completeness, and causality under CAP and CAL theorems, leveraging sparse incidence matrices and braiding actions for fault-tolerant operations. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2602_14708 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| spellingShingle | A Unified Mathematical Framework for Distributed Data Fabrics: Categorical Hypergraph Models Shaska, T. Kotsireas, I. Databases Category Theory 68P15, 18M15, 05C65 H.2.5; H.2.1; G.2.2; F.3.2 Current distributed data fabrics lack a rigorous mathematical foundation, often relying on ad-hoc architectures that struggle with consistency, lineage, and scale. We propose a mathematical framework for data fabrics, unifying heterogeneous data management in distributed systems through a hypergraph-based structure \( \mathcal{F} = (D, M, G, T, P, A) \). Datasets, metadata, transformations, policies, and analytics are modeled over a distributed system \( Σ= (N, C) \), with multi-way relationships encoded in a hypergraph \( G = (V, E) \). A categorical approach, with datasets as objects and transformations as morphisms, supports operations like data integration and federated learning. The hypergraph is embedded into a modular tensor category, capturing relational symmetries via braided monoidal structures, with geometric analogies to Hurwitz spaces enriching the algebraic modeling. We prove the NP-hardness of critical tasks, such as schema matching and dynamic partitioning, and propose spectral methods and symmetry-based alignments for scalable solutions. The framework ensures consistency, completeness, and causality under CAP and CAL theorems, leveraging sparse incidence matrices and braiding actions for fault-tolerant operations. |
| title | A Unified Mathematical Framework for Distributed Data Fabrics: Categorical Hypergraph Models |
| topic | Databases Category Theory 68P15, 18M15, 05C65 H.2.5; H.2.1; G.2.2; F.3.2 |
| url | https://arxiv.org/abs/2602.14708 |