| Main Authors: | Zhu, Jichao, Yu, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.02414 |
Similar Items
Compound Expression Recognition via Multi Model Ensemble
by: Yu, Jun, et al.
Published: (2024)
Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition
by: Li, Qifei, et al.
Published: (2024)
Enhancing Multimodal Unified Representations for Cross Modal Generalization
by: Huang, Hai, et al.
Published: (2024)
Multimodal Emotion Recognition with Vision-language Prompting and Modality Dropout
by: QI, Anbin, et al.
Published: (2024)
Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations
by: Kim, Jeonghyeon, et al.
Published: (2025)
Multimodal Video Emotion Recognition with Reliable Reasoning Priors
by: Wang, Zhepeng, et al.
Published: (2025)
Leveraging Retrieval Augment Approach for Multimodal Emotion Recognition Under Missing Modalities
by: Fan, Qi, et al.
Published: (2024)
Unsupervised Audio-Visual Segmentation with Modality Alignment
by: Bhosale, Swapnil, et al.
Published: (2024)
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models
by: Yu, Xiaomin, et al.
Published: (2026)
Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
by: Pan, Baiyu, et al.
Published: (2024)
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention
by: Wang, Tianyi, et al.
Published: (2025)
Gramian Multimodal Representation Learning and Alignment
by: Cicchetti, Giordano, et al.
Published: (2024)
MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval
by: Ju, Yeong-Joon, et al.
Published: (2024)
Learning Relative Representations for Fine-Grained Multimodal Alignment with Limited Data
by: Kim, Shiwon, et al.
Published: (2026)
Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation
by: Liang, Yupu, et al.
Published: (2025)
Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis
by: Samanta, Argha Kamal, et al.
Published: (2025)
EEG-based Multimodal Representation Learning for Emotion Recognition
by: Yin, Kang, et al.
Published: (2024)
Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models
by: He, Hulingxiao, et al.
Published: (2026)
Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling
by: Yu, Jun, et al.
Published: (2024)
Multimodal Sentiment Analysis based on Multi-channel and Symmetric Mutual Promotion Feature Fusion
by: Zhu, Wangyuan, et al.
Published: (2026)
ECMF: Enhanced Cross-Modal Fusion for Multimodal Emotion Recognition in MER-SEMI Challenge
by: Hu, Juewen, et al.
Published: (2025)
AttAnchor: Guiding Cross-Modal Token Alignment in VLMs with Attention Anchors
by: Zhang, Junyang, et al.
Published: (2025)
Solution to the 10th ABAW Expression Recognition Challenge: A Robust Multimodal Framework with Safe Cross-Attention and Modality Dropout
by: Yu, Jun, et al.
Published: (2026)
A Novel Approach to for Multimodal Emotion Recognition : Multimodal semantic information fusion
by: Dai, Wei, et al.
Published: (2025)
Uni-X: Mitigating Modality Conflict with a Two-End-Separated Architecture for Unified Multimodal Models
by: Hao, Jitai, et al.
Published: (2025)
Emotion-LLaMAv2 and MMEVerse: A New Framework and Benchmark for Multimodal Emotion Understanding
by: Peng, Xiaojiang, et al.
Published: (2026)
Learning Modality-Aware Representations: Adaptive Group-wise Interaction Network for Multimodal MRI Synthesis
by: Song, Tao, et al.
Published: (2024)
From Coarse to Nuanced: Cross-Modal Alignment of Fine-Grained Linguistic Cues and Visual Salient Regions for Dynamic Emotion Recognition
by: Liu, Yu, et al.
Published: (2025)
MANTA: Cross-Modal Semantic Alignment and Information-Theoretic Optimization for Long-form Multimodal Understanding
by: Zhong, Ziqi, et al.
Published: (2025)
CodeBind: Decoupled Representation Learning for Multimodal Alignment with Unified Compositional Codebook
by: Chen, Zeyu, et al.
Published: (2026)
MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
by: Zhang, Fan, et al.
Published: (2025)
Multimodal Prototype Alignment for Semi-supervised Pathology Image Segmentation
by: Fu, Mingxi, et al.
Published: (2025)
Beyond Cross-Modal Alignment: Measuring and Leveraging Modality Gap in Vision-Language Models
by: Yan, Hanqi, et al.
Published: (2025)
MLVTG: Mamba-Based Feature Alignment and LLM-Driven Purification for Multi-Modal Video Temporal Grounding
by: Zhu, Zhiyi, et al.
Published: (2025)
Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy
by: Huang, Jiahao, et al.
Published: (2026)
Time Series, Vision, and Language: Exploring the Limits of Alignment in Contrastive Representation Spaces
by: Yashwante, Pratham, et al.
Published: (2026)
CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis
by: Li, Yangmin, et al.
Published: (2024)
Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation
by: Wang, Zhenbin, et al.
Published: (2024)
Align Your Query: Representation Alignment for Multimodality Medical Object Detection
by: Seo, Ara, et al.
Published: (2025)
MIBench: Evaluating LMMs on Multimodal Interaction
by: Miao, Yu, et al.
Published: (2026)