Library Catalog

Bibliographic Details
Main Authors:	Shen, Zeyu; Henderson, Peter
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2604.20156

Similar Items

A Closer Look into Mixture-of-Experts in Large Language Models
by: Lo, Ka Man, et al.
Published: (2024)

Extending Puzzle for Mixture-of-Experts Reasoning Models with Application to GPT-OSS Acceleration
by: Bercovich, Akhiad, et al.
Published: (2026)

Scattered Mixture-of-Experts Implementation
by: Tan, Shawn, et al.
Published: (2024)

MoQE: Improve Quantization Model performance via Mixture of Quantization Experts
by: Zhang, Jinhao, et al.
Published: (2025)

TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts
by: Lee, Hyunwook, et al.
Published: (2024)

Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)

Mixture of Heterogeneous Grouped Experts for Language Modeling
by: Ma, Zhicheng, et al.
Published: (2026)

MC#: Mixture Compressor for Mixture-of-Experts Large Models
by: Huang, Wei, et al.
Published: (2025)

Efficiently Editing Mixture-of-Experts Models with Compressed Experts
by: He, Yifei, et al.
Published: (2025)

Varying-Coefficient Mixture of Experts Model
by: Zhao, Qicheng, et al.
Published: (2026)

Spatial-Temporal Mixture-of-Graph-Experts for Multi-Type Crime Prediction
by: Wu, Ziyang, et al.
Published: (2024)

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
by: Qiu, Zihan, et al.
Published: (2025)

LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning
by: Feng, Zeyu, et al.
Published: (2024)

Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
by: Park, Sejik
Published: (2024)

Mixture of Experts in Large Language Models
by: Zhang, Danyang, et al.
Published: (2025)

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models
by: Wang, Zihan, et al.
Published: (2025)

AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)

DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models
by: Aghdam, Maryam Akhavan, et al.
Published: (2024)

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing
by: Hao, Jiawei, et al.
Published: (2026)

Routing Mamba: Scaling State Space Models with Mixture-of-Experts Projection
by: Zhan, Zheng, et al.
Published: (2025)

Path-Constrained Mixture-of-Experts
by: Gu, Zijin, et al.
Published: (2026)

$μ$-Parametrization for Mixture of Experts
by: Małaśnicki, Jan, et al.
Published: (2025)

Expert Merging in Sparse Mixture of Experts with Nash Bargaining
by: Nguyen, Dung V., et al.
Published: (2025)

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models
by: Wang, Yan, et al.
Published: (2026)

Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models
by: Belenki, Lior, et al.
Published: (2025)

Mixture of Raytraced Experts
by: Perin, Andrea, et al.
Published: (2025)

Mixture of Lookup Experts
by: Jie, Shibo, et al.
Published: (2025)

Structured Diffusion Models with Mixture of Gaussians as Prior Distribution
by: Jia, Nanshan, et al.
Published: (2024)

Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
by: Nguyen-Nhat, Minh-Khoi, et al.
Published: (2025)

Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks
by: Feng, Zeyu, et al.
Published: (2024)

Mixture of Experts in a Mixture of RL settings
by: Willi, Timon, et al.
Published: (2024)

Bayesian Mixture of Experts For Large Language Models
by: Dialameh, Maryam, et al.
Published: (2025)

Differentially Private Training of Mixture of Experts Models
by: Tholoniat, Pierre, et al.
Published: (2024)

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
by: Tang, Anke, et al.
Published: (2024)

Speculating Experts Accelerates Inference for Mixture-of-Experts
by: Madan, Vivan, et al.
Published: (2026)

Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
by: Tang, Anke, et al.
Published: (2024)

Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models
by: Liang, Jingcong, et al.
Published: (2025)

Robustness of Mixtures of Experts to Feature Noise
by: Sun, Dong, et al.
Published: (2026)

Generalizing GNNs with Tokenized Mixture of Experts
by: Guo, Xiaoguang, et al.
Published: (2026)

Hyperparameter Transfer with Mixture-of-Expert Layers
by: Jiang, Tianze, et al.
Published: (2026)