Bibliographic Details
Main author: Lu, Jiaheng
Format: Preprint
Published: 2025
Subjects: Databases
Online access: https://arxiv.org/abs/2502.19131
author Lu, Jiaheng
contents Modern database systems face a significant challenge in effectively handling the Variety of data. The primary objective of this paper is to establish a unified data model and theoretical framework for multi-model data management. To this end, we present a categorical framework that unifies three types of structured or semi-structured data: relational, XML, and graph-structured data. Using the language of category theory, the framework offers a sound formal abstraction for representing these diverse data types. We extend the Entity-Relationship (ER) diagram with enriched semantic constraints, incorporating categorical ingredients such as pullback, pushout, and limit. Furthermore, we develop a categorical normal form theory that is applied to category data to reduce redundancy and facilitate data maintenance. These normal forms apply to relational, XML, and graph data simultaneously, eliminating the need for the ad hoc, model-specific definitions found in earlier, separate normal form theories. Finally, we discuss the connections between this new normal form framework and Boyce-Codd normal form, fourth normal form, and XML normal form.
format Preprint
id arxiv_https___arxiv_org_abs_2502_19131
institution arXiv
publishDate 2025
record_format arxiv
title A Categorical Unification for Multi-Model Data: Part I Categorical Model and Normal Forms
topic Databases
url https://arxiv.org/abs/2502.19131