Bibliographic Details
Main Authors: Köhler, Cristiano André, Ulianych, Danylo, Grün, Sonja, Decker, Stefan, Denker, Michael
Format: Preprint
Published: 2023
Subjects: Neurons and Cognition
Online Access: https://arxiv.org/abs/2311.09672
author Köhler, Cristiano André
Ulianych, Danylo
Grün, Sonja
Decker, Stefan
Denker, Michael
contents Scientific research demands reproducibility and transparency, particularly in data-intensive fields like electrophysiology. Electrophysiology data is typically analyzed using scripts that generate output files, including figures. Handling these results poses several challenges due to the complexity and interactivity of the analysis process. These challenges stem from the difficulty of discerning the analysis steps, parameters, and data flow from the results, which makes knowledge transfer and findability difficult in collaborative settings. Provenance information tracks the lineage of data and the processes applied to it, and capturing provenance during the execution of an analysis script can address those challenges. We present Alpaca (Automated Lightweight Provenance Capture), a tool that captures fine-grained provenance information with minimal user intervention when running data analysis pipelines implemented in Python scripts. Alpaca records inputs, outputs, and function parameters, and structures this information according to the W3C PROV standard. We demonstrate the tool using a realistic use case involving multichannel local field potential recordings from a neurophysiological experiment, highlighting how the tool makes the details of results available in a standardized manner to address the challenges of the analysis process. Ultimately, using Alpaca will help to represent results according to the FAIR principles, which will improve research reproducibility and facilitate the sharing of data analysis results.
format Preprint
id arxiv_https___arxiv_org_abs_2311_09672
institution arXiv
publishDate 2023
record_format arxiv
title Facilitating the sharing of electrophysiology data analysis results through in-depth provenance capture
topic Neurons and Cognition
url https://arxiv.org/abs/2311.09672