| Main Authors: | Zhu, Duowang, Huang, Xiaohu, Huang, Haiyan, Zhou, Hao, Shao, Zhenfeng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.18803 |
Similar Items
ChangeViT: Unleashing Plain Vision Transformers for Change Detection
by: Zhu, Duowang, et al.
Published: (2024)
PruneVid: Visual Token Pruning for Efficient Video Large Language Models
by: Huang, Xiaohu, et al.
Published: (2024)
Imagine How To Change: Explicit Procedure Modeling for Change Captioning
by: Sun, Jiayang, et al.
Published: (2026)
MV-CC: Mask Enhanced Video Model for Remote Sensing Change Caption
by: Liu, Ruixun, et al.
Published: (2024)
Hierarchical Dual-Change Collaborative Learning for UAV Scene Change Captioning
by: Chen, Fuhai, et al.
Published: (2026)
Pixel-Level Change Detection Pseudo-Label Learning for Remote Sensing Change Captioning
by: Liu, Chenyang, et al.
Published: (2023)
SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective
by: Zhou, Ziyu, et al.
Published: (2025)
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks
by: Wu, Peiran, et al.
Published: (2025)
3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)
UAV as Urban Construction Change Monitor: A New Benchmark and Change Captioning Model
by: Gao, Yupeng, et al.
Published: (2026)
3D Scene Change Modeling With Consistent Multi-View Aggregation
by: Zhou, Zirui, et al.
Published: (2025)
Advanced Feature Manipulation for Enhanced Change Detection Leveraging Natural Language Models
by: Li, Zhenglin, et al.
Published: (2024)
Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning
by: Tu, Yunbin, et al.
Published: (2024)
JoVA: Unified Multimodal Learning for Joint Video-Audio Generation
by: Huang, Xiaohu, et al.
Published: (2025)
Enhancing Perception of Key Changes in Remote Sensing Image Change Captioning
by: Yang, Cong, et al.
Published: (2024)
3D-SSM: A Novel 3D Selective Scan Module for Remote Sensing Change Detection
by: Huang, Rui, et al.
Published: (2025)
ChangingGrounding: 3D Visual Grounding in Changing Scenes
by: Hu, Miao, et al.
Published: (2025)
Revisiting Shadow Detection from a Vision-Language Perspective
by: Wang, Yonghui, et al.
Published: (2026)
HiSem: Hierarchical Semantic Disentangling for Remote Sensing Image Change Captioning
by: Wang, Man, et al.
Published: (2026)
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
by: Zhu, Liyuan, et al.
Published: (2023)
CG-MLLM: Captioning and Generating 3D content via Multi-modal Large Language Models
by: Huang, Junming, et al.
Published: (2026)
CEBSNet: Change-Excited and Background-Suppressed Network with Temporal Dependency Modeling for Bitemporal Change Detection
by: Xu, Qi'ao, et al.
Published: (2025)
Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance
by: Zhu, Yongshuo, et al.
Published: (2024)
SOVC: Subject-Oriented Video Captioning
by: Teng, Chang, et al.
Published: (2023)
Be the Change You Want to See: Revisiting Remote Sensing Change Detection Practices
by: Rolih, Blaž, et al.
Published: (2025)
Aesthetic Image Captioning with Saliency Enhanced MLLMs
by: Tao, Yilin, et al.
Published: (2025)
RSCaMa: Remote Sensing Image Change Captioning with State Space Model
by: Liu, Chenyang, et al.
Published: (2024)
Retrieval-Augmented Egocentric Video Captioning
by: Xu, Jilan, et al.
Published: (2024)
Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective
by: Long, Kaifang, et al.
Published: (2024)
3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
by: Huang, Xiaohu, et al.
Published: (2025)
Learning Object State Changes in Videos: An Open-World Perspective
by: Xue, Zihui, et al.
Published: (2023)
SegChange-R1: LLM-Augmented Remote Sensing Change Detection
by: Zhou, Fei
Published: (2025)
Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images
by: Yu, Xiaofei, et al.
Published: (2024)
Gaussian Difference: Find Any Change Instance in 3D Scenes
by: Jiang, Binbin, et al.
Published: (2025)
Adapting Segment Anything Model for Change Detection in HR Remote Sensing Images
by: Ding, Lei, et al.
Published: (2023)
A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning
by: Sun, Dongwei, et al.
Published: (2024)
FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition
by: Huang, Xiaohu, et al.
Published: (2024)
SAM Guided Semantic and Motion Changed Region Mining for Remote Sensing Change Captioning
by: Wang, Futian, et al.
Published: (2025)
VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation
by: Zhang, Shi-Xue, et al.
Published: (2025)
Streaming Dense Video Captioning
by: Zhou, Xingyi, et al.
Published: (2024)