| Main Authors: | Zhao, Ziyu, Zhu, Tong, Zhang, Zhi, Fan, Tiantian, Yang, Jinluan, Kuang, Kun, Wei, Zhongyu, Wu, Fei, Cheng, Yu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.15521 |
Similar Items
Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning
by: Zhao, Ziyu, et al.
Published: (2025)
Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)
ExpertWeave: Efficiently Serving Expert-Specialized Fine-Tuned Adapters at Scale
by: Shi, Ge, et al.
Published: (2025)
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
by: Sun, Weigao, et al.
Published: (2025)
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
by: Wu, Haoyuan, et al.
Published: (2025)
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
by: Zhang, Jihai, et al.
Published: (2024)
BEAM: Binary Expert Activation Masking for Dynamic Routing in MoE
by: Wu, Juntong, et al.
Published: (2026)
Elastic MoE: Unlocking the Inference-Time Scalability of Mixture-of-Experts
by: Gu, Naibin, et al.
Published: (2025)
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
by: Zhu, Xingkui, et al.
Published: (2024)
Exploiting the Experts: Unauthorized Compression in MoE-LLMs
by: Neogi, Pinaki Prasad Guha, et al.
Published: (2025)
Discovering Invariant Neighborhood Patterns for Heterophilic Graphs
by: Yang, Jinluan, et al.
Published: (2024)
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
by: Chen, Xiaodong, et al.
Published: (2025)
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
by: Zhu, Tong, et al.
Published: (2024)
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
by: Li, Lujun, et al.
Published: (2025)
Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers
by: Lau, Tim Tsz-Kit, et al.
Published: (2026)
Unifying Adversarial Perturbation for Graph Neural Networks
by: Yang, Jinluan, et al.
Published: (2025)
MoE-Loco: Mixture of Experts for Multitask Locomotion
by: Huang, Runhan, et al.
Published: (2025)
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)
Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis
by: Pei, Zehua, et al.
Published: (2025)
EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
by: Qian, Yulei, et al.
Published: (2024)
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
by: Li, Yunxin, et al.
Published: (2024)
Do Domain-specific Experts exist in MoE-based LLMs?
by: Do, Giang, et al.
Published: (2026)
Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference
by: Liu, Baihui, et al.
Published: (2026)
MoE-Beyond: Learning-Based Expert Activation Prediction on Edge Devices
by: Gavhane, Nishant, et al.
Published: (2025)
TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
by: Xu, Yu, et al.
Published: (2026)
OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)
Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation
by: Zheng, Youwei, et al.
Published: (2025)
Noise Projection: Closing the Prompt-Agnostic Gap Behind Text-to-Image Misalignment in Diffusion Models
by: Tong, Yunze, et al.
Published: (2025)
OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference
by: Wang, Liujianfu, et al.
Published: (2025)
Harder Tasks Need More Experts: Dynamic Routing in MoE Models
by: Huang, Quzhe, et al.
Published: (2024)
CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge
by: Li, Muqing, et al.
Published: (2025)
SD-MoE: Spectral Decomposition for Effective Expert Specialization
by: Huang, Ruijun, et al.
Published: (2026)
Expert Divergence Learning for MoE-based Language Models
by: Li, Jiaang, et al.
Published: (2026)
Fair-MoE: Fairness-Oriented Mixture of Experts in Vision-Language Models
by: Wang, Peiran, et al.
Published: (2025)
MergeMoE: Efficient Compression of MoE Models via Expert Output Merging
by: Miao, Ruijie, et al.
Published: (2025)
Synergistic Intra- and Cross-Layer Regularization Losses for MoE Expert Specialization
by: Hu, Rizhen, et al.
Published: (2026)
GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs
by: Deng, Jianing, et al.
Published: (2026)
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs
by: Zhou, Xin, et al.
Published: (2024)
Advancing Expert Specialization for Better MoE
by: Guo, Hongcan, et al.
Published: (2025)
Horseshoe Mixtures-of-Experts (HS-MoE)
by: Polson, Nick, et al.
Published: (2026)