:: Library Catalog

Bibliographic Details
Main Author:	He, Xu Owen
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.04153

Similar Items

Efficiently Editing Mixture-of-Experts Models with Compressed Experts
by: He, Yifei, et al.
Published: (2025)

Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism
by: Yan, Jiaming, et al.
Published: (2025)

Mixture of Raytraced Experts
by: Perin, Andrea, et al.
Published: (2025)

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
by: Nguyen, Xuan-Phi, et al.
Published: (2026)

Mixture of Experts in a Mixture of RL settings
by: Willi, Timon, et al.
Published: (2024)

Speculating Experts Accelerates Inference for Mixture-of-Experts
by: Madan, Vivan, et al.
Published: (2026)

Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques
by: He, Shwai, et al.
Published: (2024)

In-depth Analysis on Caching and Pre-fetching in Mixture of Experts Offloading
by: Lin, Shuning, et al.
Published: (2025)

HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
by: Zhao, Hao, et al.
Published: (2024)

MC#: Mixture Compressor for Mixture-of-Experts Large Models
by: Huang, Wei, et al.
Published: (2025)

Mixture of Diverse Size Experts
by: Sun, Manxi, et al.
Published: (2024)

Sparsity and Superposition in Mixture of Experts
by: Chaudhari, Marmik, et al.
Published: (2025)

Mixture of Concept Bottleneck Experts
by: De Santis, Francesco, et al.
Published: (2026)

UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
by: Huang, Minbin, et al.
Published: (2026)

AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)

RouteHijack: Routing-Aware Attack on Mixture-of-Experts LLMs
by: Xu, Zhiyuan, et al.
Published: (2026)

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts
by: Dwivedi, Chaitanya, et al.
Published: (2026)

MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts
by: Xie, Zhitian, et al.
Published: (2024)

Graph Knowledge Distillation to Mixture of Experts
by: Rumiantsev, Pavel, et al.
Published: (2024)

Theory on Mixture-of-Experts in Continual Learning
by: Li, Hongbo, et al.
Published: (2024)

Mixture of Weak & Strong Experts on Graphs
by: Zeng, Hanqing, et al.
Published: (2023)

Mixture of Experts in Large Language Models
by: Zhang, Danyang, et al.
Published: (2025)

PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model
by: Liu, Yilun, et al.
Published: (2024)

Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)

Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts
by: Yan, Fanqi, et al.
Published: (2024)

MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)

Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
by: Nguyen-Nhat, Minh-Khoi, et al.
Published: (2025)

Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts
by: Nikolic, Strahinja, et al.
Published: (2025)

MixtureKit: A General Framework for Composing, Training, and Visualizing Mixture-of-Experts Models
by: Chamma, Ahmad, et al.
Published: (2025)

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
by: He, Neil, et al.
Published: (2025)

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
by: Hu, Xing, et al.
Published: (2025)

OrdMoE: Preference Alignment via Hierarchical Expert Group Ranking in Multimodal Mixture-of-Experts LLMs
by: Gao, Yuting, et al.
Published: (2025)

Mixture of Latent Experts Using Tensor Products
by: Su, Zhan, et al.
Published: (2024)

Mixture-of-Experts Meets In-Context Reinforcement Learning
by: Wu, Wenhao, et al.
Published: (2025)

Wavelet Mixture of Experts for Time Series Forecasting
by: Zhou, Zheng, et al.
Published: (2025)

$OneMillion-Bench: How Far are Language Agents from Human Experts?
by: Yang, Qianyu, et al.
Published: (2026)

Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)

Dynamic Adaptive Shared Experts with Grouped Multi-Head Attention Mixture of Experts
by: Li, Cheng, et al.
Published: (2025)

How Many Experts Are Enough? Towards Optimal Semantic Specialization for Mixture-of-Experts
by: Park, Sumin, et al.
Published: (2025)

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing
by: Hao, Jiawei, et al.
Published: (2026)