:: Library Catalog

Bibliographic Details
Main Authors:	Shao, Hang; Gao, Heting; Shen, Yunhang; Chen, Jiawei; Long, Zuwei; Yang, Dong; Li, Ke; Sun, Xing
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.21864

Similar Items

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
by: Li, Lijiang, et al.
Published: (2026)

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
by: Long, Zuwei, et al.
Published: (2025)

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
by: Fu, Chaoyou, et al.
Published: (2025)

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
by: Wang, Xiong, et al.
Published: (2024)

LUCY: Linguistic Understanding and Control Yielding Early Stage of Her
by: Gao, Heting, et al.
Published: (2025)

VITA: Towards Open-Source Interactive Omni Multimodal LLM
by: Fu, Chaoyou, et al.
Published: (2024)

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction
by: Zhang, Haonan, et al.
Published: (2025)

LLaMA-Omni: Seamless Speech Interaction with Large Language Models
by: Fang, Qingkai, et al.
Published: (2024)

Omni-Referring Image Segmentation
by: Zheng, Qiancheng, et al.
Published: (2025)

MoE-Hub: Taming Software Complexity for Seamless MoE Overlap with Hardware-Accelerated Communication on Multi-GPU Systems
by: Zhou, Zhuoshan, et al.
Published: (2026)

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)

VEQ: Modality-Adaptive Quantization for MoE Vision-Language Models
by: Qin, Guangshuo, et al.
Published: (2026)

GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
by: Wu, Haoze, et al.
Published: (2024)

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data
by: Li, Yunxin, et al.
Published: (2025)

FlexiReID: Adaptive Mixture of Expert for Multi-Modal Person Re-Identification
by: Sun, Zhen, et al.
Published: (2025)

OmniGAIA: Towards Native Omni-Modal AI Agents
by: Li, Xiaoxi, et al.
Published: (2026)

Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
by: Wu, Haoyuan, et al.
Published: (2025)

Towards Unsupervised Speech Recognition Without Pronunciation Models
by: Ni, Junrui, et al.
Published: (2024)

SMoES: Soft Modality-Guided Expert Specialization in MoE-VLMs
by: Bo, Zi-Hao, et al.
Published: (2026)

MiM-DiT: MoE in MoE with Diffusion Transformers for All-in-One Image Restoration
by: Kong, Lingshun, et al.
Published: (2026)

MoTAS: MoE-Guided Feature Selection from TTS-Augmented Speech for Enhanced Multimodal Alzheimer's Early Screening
by: Shao, Yongqi, et al.
Published: (2025)

MoE3D: Mixture of Experts meets Multi-Modal 3D Understanding
by: Li, Yu, et al.
Published: (2025)

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
by: Chen, Junyi, et al.
Published: (2023)

MoE-Loco: Mixture of Experts for Multitask Locomotion
by: Huang, Runhan, et al.
Published: (2025)

iSchools and Non-iSchools in the USA: An Examination of Their Master's Programs
by: Chu, Heting
Published: (2012)

Hyperlinks: How Well Do They Represent the Intellectual Content of Digital Collections?
by: Chu, Heting
Published: (1997)

Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts
by: Yun, Sukwon, et al.
Published: (2024)

Is Extending Modality The Right Path Towards Omni-Modality?
by: Zhu, Tinghui, et al.
Published: (2025)

BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2024)

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
by: Chen, Qian, et al.
Published: (2025)

KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
by: Xu, Zukang, et al.
Published: (2026)

InterMoE: Individual-Specific 3D Human Interaction Generation via Dynamic Temporal-Selective MoE
by: Wang, Lipeng, et al.
Published: (2025)

Omni-DeepSearch: A Benchmark for Audio-Driven Omni-Modal Deep Search
by: Yu, Tao, et al.
Published: (2026)

I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts
by: Xin, Jiayi, et al.
Published: (2025)

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs
by: Li, Bo, et al.
Published: (2026)

OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering
by: Jia, Yiduo, et al.
Published: (2026)

AST: Adaptive, Seamless, and Training-Free Precise Speech Editing
by: Lv, Sihan, et al.
Published: (2026)

VA-MoE: Variables-Adaptive Mixture of Experts for Incremental Weather Forecasting
by: Chen, Hao, et al.
Published: (2024)

GRIN: GRadient-INformed MoE
by: Liu, Liyuan, et al.
Published: (2024)

Can Unified Generation and Understanding Models Maintain Semantic Equivalence Across Different Output Modalities?
by: Jiang, Hongbo, et al.
Published: (2026)