Bibliographic Details
Main Authors: Agnihotri, Shashank, Schader, David, Sharei, Nico, Kaçar, Mehmet Ege, Keuper, Margret
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2505.04835
Abstract:
  • Deep learning (DL) models are widely used in real-world applications but remain vulnerable to distribution shifts, especially due to weather and lighting changes. Collecting diverse real-world data for testing the robustness of DL models is resource-intensive, making synthetic corruptions an attractive alternative for robustness testing. However, are synthetic corruptions a reliable proxy for real-world corruptions? To answer this, we conduct the largest benchmarking study on semantic segmentation models, comparing performance on real-world corruptions and synthetic corruptions datasets. Our results reveal a strong correlation in mean performance, supporting the use of synthetic corruptions for robustness evaluation. We further analyze corruption-specific correlations, providing key insights to understand when synthetic corruptions succeed in representing real-world corruptions. Open-source Code: https://github.com/shashankskagnihotri/benchmarking_robustness/tree/segmentation_david/semantic_segmentation