Bibliographic Details
Main Authors: Vargas-Solar, Genoveva, Darmont, Jérôme, Adorjan, Alejandro, Espinosa-Oviedo, Javier A., Hara, Carmem, Loudcher, Sabine, Motz, Regina, Musicante, Martin, Zechinelli-Martini, José-Luis
Format: Preprint
Published: 2024
Subjects: Databases
Online Access: https://arxiv.org/abs/2403.20063
_version_ 1866913289876799488
author Vargas-Solar, Genoveva
Darmont, Jérôme
Adorjan, Alejandro
Espinosa-Oviedo, Javier A.
Hara, Carmem
Loudcher, Sabine
Motz, Regina
Musicante, Martin
Zechinelli-Martini, José-Luis
contents This vision paper introduces a pioneering data lake architecture designed to meet Life & Earth sciences' burgeoning data management needs. As the data landscape evolves, the imperative to navigate and maximize scientific opportunities has never been greater. Our vision paper outlines a strategic approach to unify and integrate diverse datasets, aiming to cultivate a collaborative space conducive to scientific discovery. The core of the design and construction of a data lake is the development of formal and semi-automatic tools, enabling the meticulous curation of quantitative and qualitative data from experiments. Our unique "research-in-the-loop" methodology ensures that scientists across various disciplines are integrally involved in the curation process, combining automated, mathematical, and manual tasks to address complex problems, from seismic detection to biodiversity studies. By fostering reproducibility and applicability of research, our approach enhances the integrity and impact of scientific experiments. This initiative is set to improve data management practices, strengthening the capacity of Life & Earth sciences to solve some of our time's most critical environmental and biological challenges.
format Preprint
id arxiv_https___arxiv_org_abs_2403_20063
institution arXiv
publishDate 2024
record_format arxiv
title Dataversifying Natural Sciences: Pioneering a Data Lake Architecture for Curated Data-Centric Experiments in Life & Earth Sciences
topic Databases
url https://arxiv.org/abs/2403.20063