:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Ziyu; Zhu, Tong; Zhang, Zhi; Fan, Tiantian; Yang, Jinluan; Kuang, Kun; Wei, Zhongyu; Wu, Fei; Cheng, Yu
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2602.15521

Similar Items

Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
by: Zhao, Ziyu, et al.
Published: (2025)

Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)

ExpertWeave: Efficiently Serving Expert-Specialized Fine-Tuned Adapters at Scale
by: Shi, Ge, et al.
Published: (2025)

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
by: Sun, Weigao, et al.
Published: (2025)

Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
by: Wu, Haoyuan, et al.
Published: (2025)

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
by: Zhang, Jihai, et al.
Published: (2024)

BEAM: Binary Expert Activation Masking for Dynamic Routing in MoE
by: Wu, Juntong, et al.
Published: (2026)

Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
by: Gu, Naibin, et al.
Published: (2025)

MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
by: Zhu, Xingkui, et al.
Published: (2024)

Exploiting the Experts: Unauthorized Compression in MoE-LLMs
by: Neogi, Pinaki Prasad Guha, et al.
Published: (2025)

Discovering Invariant Neighborhood Patterns for Heterophilic Graphs
by: Yang, Jinluan, et al.
Published: (2024)

MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
by: Chen, Xiaodong, et al.
Published: (2025)

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
by: Zhu, Tong, et al.
Published: (2024)

Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
by: Li, Lujun, et al.
Published: (2025)

Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers
by: Lau, Tim Tsz-Kit, et al.
Published: (2026)

Unifying Adversarial Perturbation for Graph Neural Networks
by: Yang, Jinluan, et al.
Published: (2025)

MoE-Loco: Mixture of Experts for Multitask Locomotion
by: Huang, Runhan, et al.
Published: (2025)

MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis
by: Pei, Zehua, et al.
Published: (2025)

EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
by: Qian, Yulei, et al.
Published: (2024)

Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
by: Li, Yunxin, et al.
Published: (2024)

Do Domain-specific Experts exist in MoE-based LLMs?
by: Do, Giang, et al.
Published: (2026)

Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference
by: Liu, Baihui, et al.
Published: (2026)

MoE-Beyond: Learning-Based Expert Activation Prediction on Edge Devices
by: Gavhane, Nishant, et al.
Published: (2025)

TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
by: Xu, Yu, et al.
Published: (2026)

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation
by: Zheng, Youwei, et al.
Published: (2025)

Noise Projection: Closing the Prompt-Agnostic Gap Behind Text-to-Image Misalignment in Diffusion Models
by: Tong, Yunze, et al.
Published: (2025)

OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference
by: Wang, Liujianfu, et al.
Published: (2025)

Harder Tasks Need More Experts: Dynamic Routing in MoE Models
by: Huang, Quzhe, et al.
Published: (2024)

CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge
by: Li, Muqing, et al.
Published: (2025)

SD-MoE: Spectral Decomposition for Effective Expert Specialization
by: Huang, Ruijun, et al.
Published: (2026)

Expert Divergence Learning for MoE-based Language Models
by: Li, Jiaang, et al.
Published: (2026)

Fair-MoE: Fairness-Oriented Mixture of Experts in Vision-Language Models
by: Wang, Peiran, et al.
Published: (2025)

MergeMoE: Efficient Compression of MoE Models via Expert Output Merging
by: Miao, Ruijie, et al.
Published: (2025)

Synergistic Intra- and Cross-Layer Regularization Losses for MoE Expert Specialization
by: Hu, Rizhen, et al.
Published: (2026)

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs
by: Deng, Jianing, et al.
Published: (2026)

Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs
by: Zhou, Xin, et al.
Published: (2024)

Advancing Expert Specialization for Better MoE
by: Guo, Hongcan, et al.
Published: (2025)

Horseshoe Mixtures-of-Experts (HS-MoE)
by: Polson, Nick, et al.
Published: (2026)