Bibliographic Details
Main Authors: Shaska, T., Kotsireas, I.
Format: Preprint
Published: 2026
Subjects: Databases; Category Theory
Online Access: https://arxiv.org/abs/2602.14708
author Shaska, T.
Kotsireas, I.
contents Current distributed data fabrics lack a rigorous mathematical foundation, often relying on ad-hoc architectures that struggle with consistency, lineage, and scale. We propose a mathematical framework for data fabrics, unifying heterogeneous data management in distributed systems through a hypergraph-based structure \( \mathcal{F} = (D, M, G, T, P, A) \). Datasets, metadata, transformations, policies, and analytics are modeled over a distributed system \( \Sigma = (N, C) \), with multi-way relationships encoded in a hypergraph \( G = (V, E) \). A categorical approach, with datasets as objects and transformations as morphisms, supports operations like data integration and federated learning. The hypergraph is embedded into a modular tensor category, capturing relational symmetries via braided monoidal structures, with geometric analogies to Hurwitz spaces enriching the algebraic modeling. We prove the NP-hardness of critical tasks, such as schema matching and dynamic partitioning, and propose spectral methods and symmetry-based alignments for scalable solutions. The framework ensures consistency, completeness, and causality under CAP and CAL theorems, leveraging sparse incidence matrices and braiding actions for fault-tolerant operations.
format Preprint
id arxiv_https___arxiv_org_abs_2602_14708
institution arXiv
publishDate 2026
record_format arxiv
title A Unified Mathematical Framework for Distributed Data Fabrics: Categorical Hypergraph Models
topic Databases
Category Theory
68P15, 18M15, 05C65
H.2.5; H.2.1; G.2.2; F.3.2
url https://arxiv.org/abs/2602.14708