:: Library Catalog

Bibliographic Details
Main Authors:	Jin, Can; Peng, Hongwu; Xiang, Mingcan; Zhang, Qixin; Yuan, Xiangchi; Hasan, Amit; Dibua, Ohiremen; Gong, Yifan; Kang, Yan; Metaxas, Dimitris N.
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.13996

Similar Items

MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache
by: Xue, Leyang, et al.
Published: (2024)

Sparsity is Combinatorial Depth: Quantifying MoE Expressivity via Tropical Geometry
by: Su, Ye, et al.
Published: (2026)

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity
by: Tang, Yehui, et al.
Published: (2025)

MoE-Prefill: Zero Redundancy Overheads in MoE Prefill Serving
by: Su, Zhaoyuan, et al.
Published: (2026)

MoE-PHDS: One MoE checkpoint for flexible runtime sparsity
by: Hannah, Lauren A., et al.
Published: (2025)

MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism
by: Zhang, Zheng, et al.
Published: (2025)

LLaDA-MoE: A Sparse MoE Diffusion Language Model
by: Zhu, Fengqi, et al.
Published: (2025)

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs
by: Cao, Shiyi, et al.
Published: (2024)

Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
by: Wu, Haoyuan, et al.
Published: (2025)

EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference
by: Qian, Yulei, et al.
Published: (2024)

GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory
by: Wu, Haoze, et al.
Published: (2024)

Sigma-MoE-Tiny Technical Report
by: Hu, Qingguo, et al.
Published: (2025)

Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs
by: Tang, Yehui, et al.
Published: (2025)

Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models
by: Wang, Bo, et al.
Published: (2026)

Improving MoE Compute Efficiency by Composing Weight and Data Sparsity
by: Kilian, Maciej, et al.
Published: (2026)

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale
by: Shi, Jingze, et al.
Published: (2026)

MoE-Sieve: Routing-Guided LoRA for Efficient MoE Fine-Tuning
by: Manzoni, Andrea
Published: (2026)

LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing
by: Nie, Xiaonan, et al.
Published: (2024)

SP-MoE: Speculative Decoding and Prefetching for Accelerating MoE-based Model Inference
by: Chen, Liangkun, et al.
Published: (2025)

OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference
by: Wang, Liujianfu, et al.
Published: (2025)

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE
by: Zeng, Zhiyuan, et al.
Published: (2024)

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning
by: Dong, Daize, et al.
Published: (2026)

SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training
by: Tang, Shengkun, et al.
Published: (2026)

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation
by: Zheng, Youwei, et al.
Published: (2025)

BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2024)

GRACE-MoE: Grouping and Replication with Locality-Aware Routing for Efficient Distributed MoE Inference
by: Han, Yu, et al.
Published: (2025)

MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing
by: Ma, Haiyue, et al.
Published: (2025)

GazeFormer-MoE: Context-Aware Gaze Estimation via CLIP and MoE Transformer
by: Zhao, Xinyuan, et al.
Published: (2026)

MoE-Compression: How the Compression Error of Experts Affects the Inference Accuracy of MoE Model?
by: Ma, Songkai, et al.
Published: (2025)

MiM-DiT: MoE in MoE with Diffusion Transformers for All-in-One Image Restoration
by: Kong, Lingshun, et al.
Published: (2026)

ECG-MoE: Mixture-of-Expert Electrocardiogram Foundation Model
by: Xu, Yuhao, et al.
Published: (2026)

GRIN: GRadient-INformed MoE
by: Liu, Liyuan, et al.
Published: (2024)

FFT-MoE: Efficient Federated Fine-Tuning for Foundation Models via Large-scale Sparse MoE under Heterogeneous Edge
by: Hu, Gang, et al.
Published: (2025)

MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching
by: Xu, Tairan, et al.
Published: (2025)

AIMER: Calibration-Free Task-Agnostic MoE Pruning
by: Liu, Zongfang, et al.
Published: (2026)

Symphony-MoE: Harmonizing Disparate Pre-trained Models into a Coherent Mixture-of-Experts
by: Wang, Qi, et al.
Published: (2025)

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
by: Chen, Junyi, et al.
Published: (2023)

Multi-Layer Scheduling for MoE-Based LLM Reasoning
by: Sun, Yifan, et al.
Published: (2026)

Continual Pre-training of MoEs: How robust is your router?
by: Thérien, Benjamin, et al.
Published: (2025)

Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
by: Wu, Chenpeng, et al.
Published: (2025)