Bibliographic Details
Main Authors:	Siva, Smriti; Cross-Zamirski, Jan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.08117

_version_	1866914314732961792
author	Siva, Smriti; Cross-Zamirski, Jan
author_facet	Siva, Smriti; Cross-Zamirski, Jan
contents	Rapid building damage assessment is critical for post-disaster response. Damage classification models built on satellite imagery provide a scalable means of obtaining situational awareness. However, label noise and severe class imbalance in satellite data create major challenges. The xBD dataset offers a standardized benchmark for building-level damage across diverse geographic regions. In this study, we evaluate Vision Transformer (ViT) model performance on the xBD dataset, specifically investigating how these models distinguish between types of structural damage when training on noisy, imbalanced data. We evaluate two models, DINOv2-small and DeiT, for multi-class damage classification. We propose a targeted patch-based pre-processing pipeline to isolate structural features and minimize background noise in training. We adopt a frozen-head fine-tuning strategy to keep computational requirements manageable. Model performance is evaluated through accuracy, precision, recall, and macro-averaged F1 scores. We show that small ViT architectures with our novel training method achieve competitive macro-averaged F1 relative to prior CNN baselines for disaster classification.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_08117
institution	arXiv
publishDate	2026
record_format	arxiv
title	Building Damage Detection using Satellite Images and Patch-Based Transformer Methods
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2602.08117

Similar Items