| Main Authors: | Qin, Jiajun, Pu, Yuan, He, Zhuolun, Kim, Seunggeun, Pan, David Z., Yu, Bei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.11815 |
Similar Items
UniMMAD: Unified Multi-Modal and Multi-Class Anomaly Detection via MoE-Driven Feature Decompression
by: Zhao, Yuan, et al.
Published: (2025)
UniHDA: A Unified and Versatile Framework for Multi-Modal Hybrid Domain Adaptation
by: Li, Hengjia, et al.
Published: (2024)
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
by: Li, Teng, et al.
Published: (2025)
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions
by: Zhang, Guozhen, et al.
Published: (2025)
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution
by: Du, Shian, et al.
Published: (2025)
UniSOT: A Unified Framework for Multi-Modality Single Object Tracking
by: Ma, Yinchao, et al.
Published: (2025)
PROMISE: Prompt-Attentive Hierarchical Contrastive Learning for Robust Cross-Modal Representation with Missing Modalities
by: Chen, Jiajun, et al.
Published: (2025)
RLBind: Adversarial-Invariant Cross-Modal Alignment for Unified Robust Embeddings
by: Lu, Yuhong
Published: (2025)
Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels
by: Pu, Ruitao, et al.
Published: (2025)
Adversarial Robustness for Unified Multi-Modal Encoders via Efficient Calibration
by: Liao, Chih-Ting, et al.
Published: (2025)
UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation
by: Zhao, Xiaoqi, et al.
Published: (2025)
MoRA: LoRA Guided Multi-Modal Disease Diagnosis with Missing Modality
by: Shi, Zhiyi, et al.
Published: (2024)
UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments
by: Su, Dayong, et al.
Published: (2025)
Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting
by: Chi, Seunggeun, et al.
Published: (2025)
Multi-Modal Generative Embedding Model
by: Ma, Feipeng, et al.
Published: (2024)
XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
by: Fang, Zhihua, et al.
Published: (2025)
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval
by: Huang, Hailang, et al.
Published: (2024)
UniRoute: Unified Routing Mixture-of-Experts for Modality-Adaptive Remote Sensing Change Detection
by: Shu, Qingling, et al.
Published: (2026)
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
by: Hao, Jitai, et al.
Published: (2025)
Modality Unified Attack for Omni-Modality Person Re-Identification
by: Bian, Yuan, et al.
Published: (2025)
Unified Multi-Modal Image Synthesis for Missing Modality Imputation
by: Zhang, Yue, et al.
Published: (2023)
Pailitao-VL: Unified Embedding and Reranker for Real-Time Multi-Modal Industrial Search
by: Chen, Lei, et al.
Published: (2026)
3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering
by: Xu, Rongtao, et al.
Published: (2025)
RingMo-Agent: A Unified Remote Sensing Foundation Model for Multi-Platform and Multi-Modal Reasoning
by: Hu, Huiyang, et al.
Published: (2025)
RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation
by: Bi, Hanbo, et al.
Published: (2025)
Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs
by: Su, Yongyi, et al.
Published: (2025)
Large Motion Model for Unified Multi-Modal Motion Generation
by: Zhang, Mingyuan, et al.
Published: (2024)
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
by: Huang, Jiehui, et al.
Published: (2025)
Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations
by: Jeong, Minoh, et al.
Published: (2024)
Meta-Learned Modality-Weighted Knowledge Distillation for Robust Multi-Modal Learning with Missing Data
by: Wang, Hu, et al.
Published: (2024)
CHARM: Collaborative Harmonization across Arbitrary Modalities for Modality-agnostic Semantic Segmentation
by: Wen, Lekang, et al.
Published: (2025)
COMMA: Co-Articulated Multi-Modal Learning
by: Hu, Lianyu, et al.
Published: (2023)
UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities
by: Khattak, Muhammad Uzair, et al.
Published: (2024)
Unified Open-World Segmentation with Multi-Modal Prompts
by: Liu, Yang, et al.
Published: (2025)
MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection
by: Nezakati, Niki, et al.
Published: (2024)
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
by: Chen, Liyang, et al.
Published: (2025)
BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment
by: Zhou, Kanglei, et al.
Published: (2026)
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
by: Zhao, Qingcheng, et al.
Published: (2024)
UniBEV: Multi-modal 3D Object Detection with Uniform BEV Encoders for Robustness against Missing Sensor Modalities
by: Wang, Shiming, et al.
Published: (2023)
MoCoLSK: Modality Conditioned High-Resolution Downscaling for Land Surface Temperature
by: Dai, Qun, et al.
Published: (2024)