Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Knüpfer, Andreas, Callow, Timothy J.
Format: Preprint
Veröffentlicht: 2025
Schlagworte:
Online-Zugang:https://arxiv.org/abs/2505.06558
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Inhaltsangabe:
  • We present a solution for research data version control and machine-actionable reproducibility of data processing for High Performance Computing (HPC) environments and the SLURM batch scheduler. Both aspects are important for research data management and the DataLad tool provides both based on the very prevalent git version control system. However, it is incompatible with HPC batch processing. The presented extension makes it compatible with HPC batch processing with the SLURM scheduler. It solves the fundamental incompatibility so that multiple jobs can be scheduled concurrently on the same data repository. It also avoids inefficient behavior patterns which may emerge on parallel file systems.