:: Library Catalog

Bibliographic Details
Main Authors:	Su, Bor-Yiing; Dykas, Peter; Chrzanowski, Mike; Chhugani, Jatin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.22804

Similar Items

MoR: Mixture of Ranks for Low-Rank Adaptation Tuning
by: Tang, Chuanyu, et al.
Published: (2024)

Unveiling the Potential of Quantization with MXFP4: Strategies for Quantization Error Reduction
by: Chhugani, Jatin, et al.
Published: (2026)

Methods of improving LLM training stability
by: Rybakov, Oleg, et al.
Published: (2024)

MixGCN: Scalable GCN Training by Mixture of Parallelism and Mixture of Accelerators
by: Wan, Cheng, et al.
Published: (2025)

MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
by: Kalra, Jushaan Singh, et al.
Published: (2025)

A Metric Driven Approach to Mixed Precision Training
by: Rasquinha, Mitchelle, et al.
Published: (2024)

PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts
by: Su, Yang, et al.
Published: (2025)

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
by: Lin, Xi Victoria, et al.
Published: (2024)

MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning
by: Zhang, Yutong, et al.
Published: (2026)

APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
by: Guan, Ziyi, et al.
Published: (2024)

MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffic-Aware Parallel Optimization
by: Guo, Jingming, et al.
Published: (2024)

MoE-DisCo: Low Economy Cost Training Mixture-of-Experts Models
by: Ye, Xin, et al.
Published: (2026)

MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)

MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts
by: Lou, Yuxuan, et al.
Published: (2026)

MoBiE: Efficient Inference of Mixture of Binary Experts under Post-Training Quantization
by: Zhao, Zhixiong, et al.
Published: (2026)

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging
by: Wang, Jiapeng, et al.
Published: (2026)

Grouter: Decoupling Routing from Representation for Accelerated MoE Training
by: Xu, Yuqi, et al.
Published: (2026)

What Can You Do When You Have Zero Rewards During RL?
by: Prakash, Jatin, et al.
Published: (2025)

Super Level Sets and Exponential Decay: A Synergistic Approach to Stable Neural Network Training
by: Chaudhary, Jatin, et al.
Published: (2024)

L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts
by: Ji, Shihao, et al.
Published: (2025)

Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training
by: Varshney, Ayush K., et al.
Published: (2026)

QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts
by: Li, Pingzhi, et al.
Published: (2024)

Training Time Prediction for Mixed Precision-based Distributed Training
by: Kang, Minchul, et al.
Published: (2026)

Pathway-based Progressive Inference (PaPI) for Energy-Efficient Continual Learning
by: Gaurav, Suyash, et al.
Published: (2025)

MoKA: Mixture of Kronecker Adapters
by: Sadeghi, Mohammadreza, et al.
Published: (2025)

Mixture of Experts (MoE): A Big Data Perspective
by: Gan, Wensheng, et al.
Published: (2025)

SDG-MoE: Signed Debate Graph Mixture-of-Experts
by: Kulibaba, Stepan, et al.
Published: (2026)

ProbMoE: Differentiable Probabilistic Routing for Mixture-of-Experts
by: Zhao, Heng, et al.
Published: (2026)

STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization
by: Federici, Marco, et al.
Published: (2025)

Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation
by: Yuan, Jinsheng, et al.
Published: (2024)

MoWE: A Mixture of Weather Experts
by: Chakraborty, Dibyajyoti, et al.
Published: (2025)

MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
by: Jin, Peng, et al.
Published: (2024)

From Molecules to Mixtures: Learning Representations of Olfactory Mixture Similarity using Inductive Biases
by: Tom, Gary, et al.
Published: (2025)

Score-of-Mixture Training: Training One-Step Generative Models Made Simple via Score Estimation of Mixture Distributions
by: Jayashankar, Tejas, et al.
Published: (2025)

MoBA: Mixture of Block Attention for Long-Context LLMs
by: Lu, Enzhe, et al.
Published: (2025)

MoIN: Mixture of Introvert Experts to Upcycle an LLM
by: Tejankar, Ajinkya, et al.
Published: (2024)

MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
by: Liu, Wenyuan, et al.
Published: (2025)

Mixed-Precision Quantization for Language Models: Techniques and Prospects
by: Rakka, Mariam, et al.
Published: (2025)

MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
by: Duanmu, Haojie, et al.
Published: (2025)

MixtureKit: A General Framework for Composing, Training, and Visualizing Mixture-of-Experts Models
by: Chamma, Ahmad, et al.
Published: (2025)