| Main Authors: | Tu, Zhilin, Li, Kemou, Li, Fengpeng, Fei, Jianwei, Zhang, Jiamin, Wu, Haiwei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.21939 |
Similar Items
State-Anchored Complete-View Distillation for Robust Conversational Multimodal Emotion Recognition
by: Pan, Zhaoyan, et al.
Published: (2026)
Multi-MLLM Knowledge Distillation for Out-of-Context News Detection
by: Gu, Yimeng, et al.
Published: (2025)
Contrastive Knowledge Distillation for Robust Multimodal Sentiment Analysis
by: Sang, Zhongyi, et al.
Published: (2024)
DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain
by: Li, Fengpeng, et al.
Published: (2024)
HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs
by: Zhang, Zijian, et al.
Published: (2025)
Large Language Models (LLMs): Deployment, Tokenomics and Sustainability
by: Dong, Haiwei, et al.
Published: (2024)
Mixture of Disentangled Experts with Missing Modalities for Robust Multimodal Sentiment Analysis
by: Li, Xiang, et al.
Published: (2026)
Bringing Robots Home: The Rise of AI Robots in Consumer Electronics
by: Dong, Haiwei, et al.
Published: (2024)
MST-Distill: Mixture of Specialized Teachers for Cross-Modal Knowledge Distillation
by: Li, Hui, et al.
Published: (2025)
Private Speech Classification without Collapse: Stabilized DP Training and Offline Distillation
by: Wen, Yadi, et al.
Published: (2026)
Robust Multi-generation Learned Compression of Point Cloud Attribute
by: Liu, Xiangzuo, et al.
Published: (2025)
Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models
by: Wu, Qiong, et al.
Published: (2024)
Decoupled Audio-Visual Dataset Distillation
by: Li, Wenyuan, et al.
Published: (2025)
RoSMM: A Robust and Secure Multi-Modal Watermarking Framework for Diffusion Models
by: Fang, ZhongLi, et al.
Published: (2025)
HDA-SELD: Hierarchical Cross-Modal Distillation with Multi-Level Data Augmentation for Low-Resource Audio-Visual Sound Event Localization and Detection
by: Wang, Qing, et al.
Published: (2025)
SFQA: A Comprehensive Perceptual Quality Assessment Dataset for Singing Face Generation
by: Gao, Zhilin, et al.
Published: (2026)
Ges-QA: A Multidimensional Quality Assessment Dataset for Audio-to-3D Gesture Generation
by: Gao, Zhilin, et al.
Published: (2025)
Tile-Weighted Rate-Distortion Optimized Packet Scheduling for 360$^\circ$ VR Video Streaming
by: Wang, Haopeng, et al.
Published: (2024)
Rethinking Multi-view Representation Learning via Distilled Disentangling
by: Ke, Guanzhou, et al.
Published: (2024)
TeMTG: Text-Enhanced Multi-Hop Temporal Graph Modeling for Audio-Visual Video Parsing
by: Chen, Yaru, et al.
Published: (2025)
Dark Side of Modalities: Reinforced Multimodal Distillation for Multimodal Knowledge Graph Reasoning
by: Zhao, Yu, et al.
Published: (2025)
Generalizing Video DeepFake Detection by Self-generated Audio-Visual Pseudo-Fakes
by: Wei, Zihe, et al.
Published: (2026)
Ensembling Synchronisation-based and Face-Voice Association Paradigms for Robust Active Speaker Detection in Egocentric Recordings
by: Clarke, Jason, et al.
Published: (2025)
Mixture-of-Prompt-Experts for Multi-modal Semantic Understanding
by: Wu, Zichen, et al.
Published: (2024)
EmoVLM-KD: Fusing Distilled Expertise with Vision-Language Models for Visual Emotion Analysis
by: Lee, SangEun, et al.
Published: (2025)
Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion
by: Zhang, Ying, et al.
Published: (2025)
Enhancing Automatic Chord Recognition via Pseudo-Labeling and Knowledge Distillation
by: Phan, Nghia, et al.
Published: (2026)
Efficient Low-Resolution Face Recognition via Bridge Distillation
by: Ge, Shiming, et al.
Published: (2024)
Robust Modality-incomplete Anomaly Detection: A Modality-instructive Framework with Benchmark
by: Miao, Bingchen, et al.
Published: (2024)
Noise-Conditioned Mixture-of-Experts Framework for Robust Speaker Verification
by: Gu, Bin, et al.
Published: (2025)
M6: Multi-generator, Multi-domain, Multi-lingual and cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases
by: Li, Yupei, et al.
Published: (2024)
SFE-Net: Harnessing Biological Principles of Differential Gene Expression for Improved Feature Selection in Deep Learning Networks
by: Li, Yuqi, et al.
Published: (2024)
Enhancing Image-Text Matching with Adaptive Feature Aggregation
by: Wang, Zuhui, et al.
Published: (2024)
Learning Contrastive Self-Distillation for Ultra-Fine-Grained Visual Categorization Targeting Limited Samples
by: Fang, Ziye, et al.
Published: (2023)
MoLEx: Mixture of LoRA Experts in Speech Self-Supervised Models for Audio Deepfake Detection
by: Pan, Zihan, et al.
Published: (2025)
How to Cache Important Contents for Multi-modal Service in Dynamic Networks: A DRL-based Caching Scheme
by: Zhang, Zhe, et al.
Published: (2024)
Distilling Implicit Multimodal Knowledge into Large Language Models for Zero-Resource Dialogue Generation
by: Zhang, Bo, et al.
Published: (2024)
Multi Agents Semantic Emotion Aligned Music to Image Generation with Music Derived Captions
by: Shi, Junchang, et al.
Published: (2025)
Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition
by: Zhang, Junzheng, et al.
Published: (2024)
CDI-DTI: A Strong Cross-domain Interpretable Drug-Target Interaction Prediction Framework Based on Multi-Strategy Fusion
by: Li, Xiangyu, et al.
Published: (2025)