Library Catalog

Bibliographic Details
Main Authors:	Do, Giang; Le, Hung; Tran, Truyen
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2604.05267

Similar Items

S2MoE: Robust Sparse Mixture of Experts via Stochastic Learning
by: Do, Giang, et al.
Published: (2025)

Rethinking Sparse Mixture of Experts from a Unified Perspective
by: Do, Giang, et al.
Published: (2025)

SimSMoE: Solving Representational Collapse via Similarity Measure
by: Do, Giang, et al.
Published: (2024)

Eigenvectors of Experts are Training-free Non-collapsing Routers
by: Do, Giang, et al.
Published: (2026)

On the Role of Discrete Representation in Sparse Mixture of Experts
by: Do, Giang, et al.
Published: (2024)

Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
by: Wu, Haoyuan, et al.
Published: (2025)

Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
by: Le, Hung, et al.
Published: (2024)

Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)

EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
by: Qian, Yulei, et al.
Published: (2024)

What Gets Activated: Uncovering Domain and Driver Experts in MoE Language Models
by: Hu, Guimin, et al.
Published: (2026)

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
by: Li, Yunxin, et al.
Published: (2024)

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs
by: Deng, Jianing, et al.
Published: (2026)

MH-MoE: Multi-Head Mixture-of-Experts
by: Huang, Shaohan, et al.
Published: (2024)

$\infty$-MoE: Generalizing Mixture of Experts to Infinite Experts
by: Takashiro, Shota, et al.
Published: (2026)

ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
by: Zhao, Ziyu, et al.
Published: (2026)

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)

Advancing Expert Specialization for Better MoE
by: Guo, Hongcan, et al.
Published: (2025)

Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models
by: Jiang, Songtao, et al.
Published: (2024)

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
by: Yue, Tongtian, et al.
Published: (2024)

MoE-Prism: Disentangling Monolithic Experts for Elastic MoE Services via Model-System Co-Designs
by: Xia, Xinfeng, et al.
Published: (2025)

SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts
by: Muzio, Alexandre, et al.
Published: (2024)

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity
by: Tang, Yehui, et al.
Published: (2025)

Dynamic Expert Specialization: Towards Catastrophic Forgetting-Free Multi-Domain MoE Adaptation
by: Li, Junzhuo, et al.
Published: (2025)

Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models
by: Le, Quang-Hung, et al.
Published: (2024)

ROMER: Expert Replacement and Router Calibration for Robust MoE LLMs on Analog Compute-in-Memory Systems
by: Zhou, Wenyong, et al.
Published: (2026)

Leave It to the Experts: Detecting Knowledge Distillation via MoE Expert Signatures
by: Li, Pingzhi, et al.
Published: (2025)

Expert Selections In MoE Models Reveal (Almost) As Much As Text
by: Nuriyev, Amir, et al.
Published: (2026)

BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR
by: Ma, Guodong, et al.
Published: (2025)

SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation
by: Li, Zichong, et al.
Published: (2025)

MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
by: Yang, Shu, et al.
Published: (2024)

Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs
by: Tang, Yehui, et al.
Published: (2025)

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning
by: Li, Zongqian, et al.
Published: (2025)

Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
by: Gu, Naibin, et al.
Published: (2025)

Harder Tasks Need More Experts: Dynamic Routing in MoE Models
by: Huang, Quzhe, et al.
Published: (2024)

Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks
by: Chernov, Andrei
Published: (2025)

GMoE: Empowering LLMs Fine-Tuning via MoE Graph Collaboration
by: Bai, Ting, et al.
Published: (2024)

LLaDA-MoE: A Sparse MoE Diffusion Language Model
by: Zhu, Fengqi, et al.
Published: (2025)

CP-MoE: Consistency-Preserving Mixture-of-Experts for Continual Learning
by: Liu, Yang, et al.
Published: (2026)

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
by: Sun, Weigao, et al.
Published: (2025)

Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference
by: Liu, Baihui, et al.
Published: (2026)