:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Hao Mark; Mo, Zhiwen; Lee, Royson; Wang, Qianzhou; Li, Da; Hu, Shell Xu; Luk, Wayne; Hospedales, Timothy; Fan, Hongxiang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.00879

Similar Items

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
by: Chen, Hao Mark, et al.
Published: (2025)

Model Diffusion for Certifiable Few-shot Transfer Learning
by: Rezk, Fady, et al.
Published: (2025)

Feed-Forward Latent Domain Adaptation
by: Bohdal, Ondrej, et al.
Published: (2022)

MobileQuant: Mobile-friendly Quantization for On-device Language Models
by: Tan, Fuwen, et al.
Published: (2024)

FastTTS: Accelerating Test-Time Scaling for Edge LLM Reasoning
by: Chen, Hao Mark, et al.
Published: (2025)

Recurrent Early Exits for Federated Learning with Heterogeneous Clients
by: Lee, Royson, et al.
Published: (2024)

A Bayesian Approach to Data Point Selection
by: Xu, Xinnuo, et al.
Published: (2024)

FedP$^2$EFT: Federated Learning to Personalize PEFT for Multilingual LLMs
by: Lee, Royson, et al.
Published: (2025)

Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)

Enhancing LLM-based Quantum Code Generation with Multi-Agent Optimization and Quantum Error Correction
by: Campbell, Charlie, et al.
Published: (2025)

DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators
by: Mo, Zhiwen, et al.
Published: (2026)

Enhancing Trustworthiness with Mixed Precision: Benchmarks, Opportunities, and Challenges
by: Lu, Guanxi, et al.
Published: (2025)

Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA
by: Zhang, Zehuan, et al.
Published: (2024)

Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
by: Han, Yixuan, et al.
Published: (2025)

Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)

HD-MoE: Hybrid and Dynamic Parallelism for Mixture-of-Expert LLMs with 3D Near-Memory Processing
by: Huang, Haochen, et al.
Published: (2025)

CLUES: Collaborative High-Quality Data Selection for LLMs via Training Dynamics
by: Zhao, Wanru, et al.
Published: (2025)

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts
by: Cai, Weilin, et al.
Published: (2024)

ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
by: Chavhan, Ruchika, et al.
Published: (2024)

Dynamic Adaptive Shared Experts with Grouped Multi-Head Attention Mixture of Experts
by: Li, Cheng, et al.
Published: (2025)

Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization
by: Wang, Zhican, et al.
Published: (2025)

Towards Building Private LLMs: Exploring Multi-Node Expert Parallelism on Apple Silicon for Mixture-of-Experts Large Language Model
by: Chen, Mu-Chi, et al.
Published: (2025)

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts
by: Nguyen, Xuan-Phi, et al.
Published: (2026)

TradExpert: Revolutionizing Trading with Mixture of Expert LLMs
by: Ding, Qianggang, et al.
Published: (2024)

UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
by: Huang, Minbin, et al.
Published: (2026)

FLEx: Personalized Federated Learning for Mixture-of-Experts LLMs via Expert Grafting
by: Liu, Fan, et al.
Published: (2025)

Accelerating MRI Uncertainty Estimation with Mask-based Bayesian Neural Network
by: Zhang, Zehuan, et al.
Published: (2024)

MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
by: Zhu, Ruidong, et al.
Published: (2025)

Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA
by: Chen, Hao Mark, et al.
Published: (2024)

SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
by: Zhuang, Haomin, et al.
Published: (2024)

Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
by: Cai, Ruisi, et al.
Published: (2024)

SYMI: Efficient Mixture-of-Experts Training via Model and Optimizer State Decoupling
by: Skiadopoulos, Athinagoras, et al.
Published: (2025)

HAP: Hybrid Adaptive Parallelism for Efficient Mixture-of-Experts Inference
by: Lin, Haoran, et al.
Published: (2025)

Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
by: Zhou, Yixiao, et al.
Published: (2025)

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing
by: Hao, Jiawei, et al.
Published: (2026)

Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
by: Chu, Kexin, et al.
Published: (2025)

From Misclassifications to Outliers: Joint Reliability Assessment in Classification
by: Li, Yang, et al.
Published: (2026)

Mixture of Experts for Low-Resource LLMs
by: Joseph, Ori Bar, et al.
Published: (2026)

Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
by: Chen, Hao Mark, et al.
Published: (2025)

Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
by: Bai, Jun, et al.
Published: (2025)