:: Library Catalog

Bibliographic Details
Main Authors:	Ma, Xiang; Xu, Litian; Fang, Lexin; Zhang, Caiming; Cui, Lizhen
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.11175

Similar Items

Bridging the Modality Gap: Dimension Information Alignment and Sparse Spatial Constraint for Image-Text Matching
by: Ma, Xiang, et al.
Published: (2024)

Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation
by: Fang, Lexin, et al.
Published: (2025)

Aligning the True Semantics: Constrained Decoupling and Distribution Sampling for Cross-Modal Alignment
by: Ma, Xiang, et al.
Published: (2026)

Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval
by: Li, Hao, et al.
Published: (2023)

Multi-modal Semantic Understanding with Contrastive Cross-modal Feature Alignment
by: Zhang, Ming, et al.
Published: (2024)

CKDA: Cross-modality Knowledge Disentanglement and Alignment for Visible-Infrared Lifelong Person Re-identification
by: Cui, Zhenyu, et al.
Published: (2025)

DAMA: Data- and Model-aware Alignment of Multi-modal LLMs
by: Lu, Jinda, et al.
Published: (2025)

CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation
by: Yu, Rongjia, et al.
Published: (2026)

Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation
by: Tao, Yitian, et al.
Published: (2024)

Prototypical Progressive Alignment and Reweighting for Generalizable Semantic Segmentation
by: Zhang, Yuhang, et al.
Published: (2025)

Multi-level Cross-modal Alignment for Image Clustering
by: Qiu, Liping, et al.
Published: (2024)

Robust ID-Specific Face Restoration via Alignment Learning
by: Fang, Yushun, et al.
Published: (2025)

SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
by: Zhang, Runmin, et al.
Published: (2024)

Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval
by: Li, Jiaxing, et al.
Published: (2025)

Cross-Modal Prototype Alignment and Mixing for Training-Free Few-Shot Classification
by: Goswami, Dipam, et al.
Published: (2026)

Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning
by: Xu, Lizhen, et al.
Published: (2025)

Towards Domain-Generalized Open-Vocabulary Object Detection: A Progressive Domain-invariant Cross-modal Alignment Method
by: Xu, Xiaoran, et al.
Published: (2026)

Annotations Are Not All You Need: A Cross-modal Knowledge Transfer Network for Unsupervised Temporal Sentence Grounding
by: Fang, Xiang, et al.
Published: (2026)

CrossWeaver: Cross-modal Weaving for Arbitrary-Modality Semantic Segmentation
by: Zhang, Zelin, et al.
Published: (2026)

Unsupervised Domain Adaptation via Similarity-based Prototypes for Cross-Modality Segmentation
by: Ye, Ziyu, et al.
Published: (2025)

Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection
by: Zongzhen, Liu, et al.
Published: (2025)

e5-omni: Explicit Cross-modal Alignment for Omni-modal Embeddings
by: Chen, Haonan, et al.
Published: (2026)

CAST: Cross-modal Alignment Similarity Test for Vision Language Models
by: Dagan, Gautier, et al.
Published: (2024)

CADFormer: Fine-Grained Cross-modal Alignment and Decoding Transformer for Referring Remote Sensing Image Segmentation
by: Liu, Maofu, et al.
Published: (2025)

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning
by: Panagopoulou, Artemis, et al.
Published: (2023)

Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment
by: Zavras, Angelos, et al.
Published: (2024)

Style-Aware Blending and Prototype-Based Cross-Contrast Consistency for Semi-Supervised Medical Image Segmentation
by: Chen, Chaowei, et al.
Published: (2025)

Cross-modal Context-aware Learning for Visual Prompt Guided Multimodal Image Understanding in Remote Sensing
by: Zhang, Xu, et al.
Published: (2025)

Cross-modal Fuzzy Alignment Network for Text-Aerial Person Retrieval and A Large-scale Benchmark
by: Deng, Yifei, et al.
Published: (2026)

Cross-Resolution SAR Target Detection Using Structural Hierarchy Adaptation and Reliable Adjacency Alignment
by: Qin, Jiang, et al.
Published: (2025)

Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer
by: Liu, Jiaming, et al.
Published: (2022)

Enhanced Cross-modal 3D Retrieval via Tri-modal Reconstruction
by: Ren, Junlong, et al.
Published: (2025)

InfScene-SR: Arbitrary-Size Image Super-Resolution via Iterative Joint-Denoising
by: Sun, Shoukun, et al.
Published: (2026)

Cross-modal Full-mode Fine-grained Alignment for Text-to-Image Person Retrieval
by: Yin, Hao, et al.
Published: (2025)

AsyncBEV: Cross-modal Flow Alignment in Asynchronous 3D Object Detection
by: Wang, Shiming, et al.
Published: (2026)

Iterative Definition Refinement for Zero-Shot Classification via LLM-Based Semantic Prototype Optimization
by: Rehmat, Naeem, et al.
Published: (2026)

Active Diffusion Matching: Score-based Iterative Alignment of Cross-Modal Retinal Images
by: Lee, Kanggeon, et al.
Published: (2026)

Cross-modal Prompting for Balanced Incomplete Multi-modal Emotion Recognition
by: He, Wen-Jue, et al.
Published: (2025)

Deep Unfolding Multi-modal Image Fusion Network via Attribution Analysis
by: Bai, Haowen, et al.
Published: (2025)

Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality
by: Wang, Hu, et al.
Published: (2023)