| Main Authors: | Wang, Jiaxi, Hu, Wenhui, Liu, Xueyang, Wu, Beihu, Qiu, Yuting, Cai, YingYing |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2312.17648 |
Similar Items
CrossFlowDG: Bridging the Modality Gap with Cross-modal Flow Matching for Domain Generalization
by: Kritikos, Antonios, et al.
Published: (2026)
Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality
by: Wang, Hu, et al.
Published: (2023)
Asymmetric Cross-Modal Knowledge Distillation: Bridging Modalities with Weak Semantic Consistency
by: Wei, Riling, et al.
Published: (2025)
Bridging the Intent Gap: Knowledge-Enhanced Visual Generation
by: Cheng, Yi, et al.
Published: (2024)
Bridging the Gap in Missing Modalities: Leveraging Knowledge Distillation and Style Matching for Brain Tumor Segmentation
by: Zhu, Shenghao, et al.
Published: (2025)
Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification
by: Liang, Tengfei, et al.
Published: (2023)
Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior
by: Wu, Haitao, et al.
Published: (2025)
SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention
by: Xiao, Feng, et al.
Published: (2024)
Multi-Modal LLM based Image Captioning in ICT: Bridging the Gap Between General and Industry Domain
by: Chao, Lianying, et al.
Published: (2026)
DriveXQA: Cross-modal Visual Question Answering for Adverse Driving Scene Understanding
by: Tao, Mingzhe, et al.
Published: (2026)
KG-ViP: Bridging Knowledge Grounding and Visual Perception in Multi-modal LLMs for Visual Question Answering
by: Li, Zhiyang, et al.
Published: (2026)
Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation
by: Zheng, Xu, et al.
Published: (2024)
Non-target Divergence Hypothesis: Toward Understanding Domain Gaps in Cross-Modal Knowledge Distillation
by: Chen, Yilong, et al.
Published: (2024)
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
by: Wang, Le, et al.
Published: (2025)
Adaptive Perception for Unified Visual Multi-modal Object Tracking
by: Hu, Xiantao, et al.
Published: (2025)
Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment
by: Zavras, Angelos, et al.
Published: (2024)
Bridging Ears and Eyes: Analyzing Audio and Visual Large Language Models to Humans in Visible Sound Recognition and Reducing Their Sensory Gap via Cross-Modal Distillation
by: Jiang, Xilin, et al.
Published: (2025)
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
by: Mistretta, Marco, et al.
Published: (2025)
Fusion-then-Distillation: Toward Cross-modal Positive Distillation for Domain Adaptive 3D Semantic Segmentation
by: Wu, Yao, et al.
Published: (2024)
Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion
by: Li, Xilai, et al.
Published: (2023)
MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
by: Zhao, Yujian, et al.
Published: (2025)
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation
by: Lu, Andong, et al.
Published: (2024)
Multi-level Cross-modal Alignment for Image Clustering
by: Qiu, Liping, et al.
Published: (2024)
Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix
by: Liu, Peng
Published: (2021)
CrossWeaver: Cross-modal Weaving for Arbitrary-Modality Semantic Segmentation
by: Zhang, Zelin, et al.
Published: (2026)
Bridging the Semantic-Action Gap in Visual Token Pruning for Efficient VLA Inference
by: Liu, Ziyan, et al.
Published: (2025)
Phrase Decoupling Cross-Modal Hierarchical Matching and Progressive Position Correction for Visual Grounding
by: Xie, Minghong, et al.
Published: (2024)
Visual Grounding with Multi-modal Conditional Adaptation
by: Yao, Ruilin, et al.
Published: (2024)
Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models
by: Peng, Ying, et al.
Published: (2025)
SIGMA: Bridging Structural and Distributional Gaps for Vision Foundation Model Adaptation
by: Xiong, Lingyu, et al.
Published: (2026)
Lightweight Contrastive Distilled Hashing for Online Cross-modal Retrieval
by: Li, Jiaxing, et al.
Published: (2025)
Dialogue Director: Bridging the Gap in Dialogue Visualization for Multimodal Storytelling
by: Zhang, Min, et al.
Published: (2024)
Bridging the Inter-Domain Gap through Low-Level Features for Cross-Modal Medical Image Segmentation
by: Lyu, Pengfei, et al.
Published: (2025)
RGBX-R1: Visual Modality Chain-of-Thought Guided Reinforcement Learning for Multimodal Grounding
by: Wu, Jiahe, et al.
Published: (2026)
S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning
by: Lin, Weihao, et al.
Published: (2024)
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
by: Li, Leheng, et al.
Published: (2024)
Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment
by: Liu, Henglin, et al.
Published: (2025)
DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation
by: He, Jing, et al.
Published: (2024)
Multi-modal Generation via Cross-Modal In-Context Learning
by: Kumar, Amandeep, et al.
Published: (2024)
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
by: Govindarajan, Hariprasath, et al.
Published: (2025)