| Main Authors: | Dwivedi, Chaitanya; Huang, Binxuan; Gupta, Himanshu; Jayarao, Pratik; Varshney, Neeraj; Yin, Bing |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.19835 |
Similar Items
Explicit Reasoning Makes Better Judges: A Systematic Study on Accuracy, Efficiency, and Robustness
by: Jayarao, Pratik, et al.
Published: (2025)
Code Mixologist : A Practitioner's Guide to Building Code-Mixed LLMs
by: Gupta, Himanshu, et al.
Published: (2026)
MoIN: Mixture of Introvert Experts to Upcycle an LLM
by: Tejankar, Ajinkya, et al.
Published: (2024)
Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)
Efficiently Editing Mixture-of-Experts Models with Compressed Experts
by: He, Yifei, et al.
Published: (2025)
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
by: Nakamura, Taishi, et al.
Published: (2025)
UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
by: Huang, Minbin, et al.
Published: (2026)
Dynamic Mixture of Experts Against Severe Distribution Shifts
by: Kim, Donghu
Published: (2025)
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
by: Hui, Tingfeng, et al.
Published: (2024)
AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
by: Fang, Zhiyuan, et al.
Published: (2025)
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
by: Lu, Xudong, et al.
Published: (2024)
Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism
by: Yan, Jiaming, et al.
Published: (2025)
MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling
by: Teo, Rachel S. Y., et al.
Published: (2025)
Speculating Experts Accelerates Inference for Mixture-of-Experts
by: Madan, Vivan, et al.
Published: (2026)
Less is More: Undertraining Experts Improves Model Upcycling
by: Horoi, Stefan, et al.
Published: (2025)
Shift Happens: Mixture of Experts based Continual Adaptation in Federated Learning
by: Bhope, Rahul Atul, et al.
Published: (2025)
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
by: Yang, Cheng, et al.
Published: (2024)
Mixture of Raytraced Experts
by: Perin, Andrea, et al.
Published: (2025)
MC#: Mixture Compressor for Mixture-of-Experts Large Models
by: Huang, Wei, et al.
Published: (2025)
Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
by: Chowdhury, Mohammed Nowaz Rabbani, et al.
Published: (2026)
Multi-Head Mixture-of-Experts
by: Wu, Xun, et al.
Published: (2024)
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
by: Ding, Yifeng, et al.
Published: (2024)
Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)
Mixture of Concept Bottleneck Experts
by: De Santis, Francesco, et al.
Published: (2026)
Mixture of Diverse Size Experts
by: Sun, Manxi, et al.
Published: (2024)
Sparsity and Superposition in Mixture of Experts
by: Chaudhari, Marmik, et al.
Published: (2025)
Mixture of A Million Experts
by: He, Xu Owen
Published: (2024)
Mixture of Experts in a Mixture of RL settings
by: Willi, Timon, et al.
Published: (2024)
TT-LoRA MoE: Unifying Parameter-Efficient Fine-Tuning and Sparse Mixture-of-Experts
by: Kunwar, Pradip, et al.
Published: (2025)
Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
by: Nguyen, Xuan-Phi, et al.
Published: (2026)
Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
by: Nguyen-Nhat, Minh-Khoi, et al.
Published: (2025)
Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts
by: Yan, Fanqi, et al.
Published: (2024)
Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts
by: Nikolic, Strahinja, et al.
Published: (2025)
MoEMeta: Mixture-of-Experts Meta Learning for Few-Shot Relational Learning
by: Wu, Han, et al.
Published: (2025)
Graph Knowledge Distillation to Mixture of Experts
by: Rumiantsev, Pavel, et al.
Published: (2024)
Mixture of Weak & Strong Experts on Graphs
by: Zeng, Hanqing, et al.
Published: (2023)
Mixture of Experts in Large Language Models
by: Zhang, Danyang, et al.
Published: (2025)
Theory on Mixture-of-Experts in Continual Learning
by: Li, Hongbo, et al.
Published: (2024)