Saved in:
Bibliographic Details
Main Authors: Kumar, Prasanna, Soni, Nishank, Munje, Gaurang
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2602.22237
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • Distributed storage architectures are foundational to modern cloud-native infrastructure, yet a critical operational bottleneck persists within disaster recovery (DR) workflows: the dependence on content-based cryptographic hashing for data identification and synchronization. While hash-based deduplication is effective for storage efficiency in steady-state operation, it becomes a systemic liability during failover and failback events when hash indexes are stale, incomplete, or must be rebuilt following a crash. This paper precisely characterizes the operational conditions under which full or partial re-hashing becomes unavoidable. The paper also analyzes the downstream impact of cryptographic re-hashing on Recovery Time Objective (RTO) compliance, and proposes a generalized architectural shift toward deterministic, metadata-driven identification. The proposed framework assigns globally unique composite identifiers to data blocks at ingestion time-independent of content analysis enabling instantaneous delta computation during DR without any cryptographic overhead.