:: Library Catalog

Bibliographic Details
Main Authors:	Dwivedi, Chaitanya; Huang, Binxuan; Gupta, Himanshu; Jayarao, Pratik; Varshney, Neeraj; Yin, Bing
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.19835

Similar Items

Explicit Reasoning Makes Better Judges: A Systematic Study on Accuracy, Efficiency, and Robustness
by: Jayarao, Pratik, et al.
Published: (2025)

Code Mixologist: A Practitioner's Guide to Building Code-Mixed LLMs
by: Gupta, Himanshu, et al.
Published: (2026)

MoIN: Mixture of Introvert Experts to Upcycle an LLM
by: Tejankar, Ajinkya, et al.
Published: (2024)

Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)

Efficiently Editing Mixture-of-Experts Models with Compressed Experts
by: He, Yifei, et al.
Published: (2025)

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
by: Nakamura, Taishi, et al.
Published: (2025)

UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
by: Huang, Minbin, et al.
Published: (2026)

Dynamic Mixture of Experts Against Severe Distribution Shifts
by: Kim, Donghu
Published: (2025)

MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)

Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
by: Hui, Tingfeng, et al.
Published: (2024)

AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)

Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
by: Fang, Zhiyuan, et al.
Published: (2025)

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
by: Lu, Xudong, et al.
Published: (2024)

Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism
by: Yan, Jiaming, et al.
Published: (2025)

MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling
by: Teo, Rachel S. Y., et al.
Published: (2025)

Speculating Experts Accelerates Inference for Mixture-of-Experts
by: Madan, Vivan, et al.
Published: (2026)

Less is More: Undertraining Experts Improves Model Upcycling
by: Horoi, Stefan, et al.
Published: (2025)

Shift Happens: Mixture of Experts based Continual Adaptation in Federated Learning
by: Bhope, Rahul Atul, et al.
Published: (2025)

MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
by: Yang, Cheng, et al.
Published: (2024)

Mixture of Raytraced Experts
by: Perin, Andrea, et al.
Published: (2025)

MC#: Mixture Compressor for Mixture-of-Experts Large Models
by: Huang, Wei, et al.
Published: (2025)

Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
by: Chowdhury, Mohammed Nowaz Rabbani, et al.
Published: (2026)

Multi-Head Mixture-of-Experts
by: Wu, Xun, et al.
Published: (2024)

XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
by: Ding, Yifeng, et al.
Published: (2024)

Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)

Mixture of Concept Bottleneck Experts
by: De Santis, Francesco, et al.
Published: (2026)

Mixture of Diverse Size Experts
by: Sun, Manxi, et al.
Published: (2024)

Sparsity and Superposition in Mixture of Experts
by: Chaudhari, Marmik, et al.
Published: (2025)

Mixture of A Million Experts
by: He, Xu Owen
Published: (2024)

Mixture of Experts in a Mixture of RL settings
by: Willi, Timon, et al.
Published: (2024)

TT-LoRA MoE: Unifying Parameter-Efficient Fine-Tuning and Sparse Mixture-of-Experts
by: Kunwar, Pradip, et al.
Published: (2025)

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
by: Nguyen, Xuan-Phi, et al.
Published: (2026)

Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
by: Nguyen-Nhat, Minh-Khoi, et al.
Published: (2025)

Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts
by: Yan, Fanqi, et al.
Published: (2024)

Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts
by: Nikolic, Strahinja, et al.
Published: (2025)

MoEMeta: Mixture-of-Experts Meta Learning for Few-Shot Relational Learning
by: Wu, Han, et al.
Published: (2025)

Graph Knowledge Distillation to Mixture of Experts
by: Rumiantsev, Pavel, et al.
Published: (2024)

Mixture of Weak & Strong Experts on Graphs
by: Zeng, Hanqing, et al.
Published: (2023)

Mixture of Experts in Large Language Models
by: Zhang, Danyang, et al.
Published: (2025)

Theory on Mixture-of-Experts in Continual Learning
by: Li, Hongbo, et al.
Published: (2024)