Bibliographic Details
Main Authors: Zhang, Xu; Li, Danyang; Dong, Xiaohang; Wu, Tianhao; Yu, Hualong; Wang, Jianye; Li, Qicheng; Li, Xiang
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Computation and Language
Online Access: https://arxiv.org/abs/2511.02607
_version_ 1866911287136485376
author Zhang, Xu
Li, Danyang
Dong, Xiaohang
Wu, Tianhao
Yu, Hualong
Wang, Jianye
Li, Qicheng
Li, Xiang
contents Change detection (CD) is a fundamental task for monitoring and analyzing land cover dynamics. While recent high-performance models and high-quality datasets have significantly advanced the field, a critical limitation persists: current models typically acquire limited knowledge from single-type annotated data and cannot concurrently leverage diverse binary change detection (BCD) and semantic change detection (SCD) datasets. This constraint leads to poor generalization and limited versatility. Recent advances in Multimodal Large Language Models (MLLMs) introduce new possibilities for a unified CD framework. We leverage the language priors and unification capabilities of MLLMs to develop UniChange, the first MLLM-based unified change detection model. UniChange integrates generative language abilities with specialized CD functionalities. Our model unifies the BCD and SCD tasks by introducing three special tokens: [T1], [T2], and [CHANGE]. Furthermore, UniChange uses text prompts to guide the identification of change categories, eliminating the reliance on predefined classification heads. This design allows UniChange to acquire knowledge from multi-source datasets even when their class definitions conflict. Experiments on four public benchmarks (WHU-CD, S2Looking, LEVIR-CD+, and SECOND) demonstrate state-of-the-art (SOTA) performance, with IoU scores of 90.41, 53.04, 78.87, and 57.62, respectively, surpassing all previous methods. The code is available at https://github.com/Erxucomeon/UniChange.
format Preprint
id arxiv_https___arxiv_org_abs_2511_02607
institution arXiv
publishDate 2025
record_format arxiv
title UniChange: Unifying Change Detection with Multimodal Large Language Model
topic Computer Vision and Pattern Recognition
Computation and Language
url https://arxiv.org/abs/2511.02607