| Main Authors: | Petridis, Savvas, Wedin, Ben, Yuan, Ann, Wexler, James, Thain, Nithum |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.04894 |
Similar Items
Choose Your Agent: Tradeoffs in Adopting AI Advisors, Coaches, and Delegates in Multi-Party Negotiation
by: Zhu, Kehang, et al.
Published: (2026)
Strategic Tradeoffs Between Humans and AI in Multi-Agent Bargaining
by: Qian, Crystal, et al.
Published: (2025)
Thinking Like a Scientist: Can Interactive Simulations Foster Critical AI Literacy?
by: Zhao, Yiling, et al.
Published: (2025)
Improving Neutral Point-of-View Generation with Data- and Parameter-Efficient RL
by: Hoffmann, Jessica, et al.
Published: (2025)
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
by: Sukhbaatar, Sainbayar, et al.
Published: (2024)
Superposition in Transformers: A Novel Way of Building Mixture of Experts
by: Chaliah, Ayoub Ben, et al.
Published: (2024)
MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper
by: Zeng, Runjia, et al.
Published: (2025)
Inverse Constitutional AI: Compressing Preferences into Principles
by: Findeis, Arduin, et al.
Published: (2024)
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
by: Wang, Ruochen, et al.
Published: (2024)
Stealing User Prompts from Mixture of Experts
by: Yona, Itay, et al.
Published: (2024)
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
by: Liu, Zhili, et al.
Published: (2024)
Training Sparse Mixture Of Experts Text Embedding Models
by: Nussbaum, Zach, et al.
Published: (2025)
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition
by: Nguyen, Nam V., et al.
Published: (2025)
Deliberate Lab: A Platform for Real-Time Human-AI Social Experiments
by: Qian, Crystal, et al.
Published: (2025)
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models
by: Pan, Bowen, et al.
Published: (2024)
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
by: Gritsch, Nikolas, et al.
Published: (2024)
Automatic Histograms: Leveraging Language Models for Text Dataset Exploration
by: Reif, Emily, et al.
Published: (2024)
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
by: Wei, Tianwen, et al.
Published: (2024)
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
by: Nakamura, Taishi, et al.
Published: (2025)
Routing by Analogy: kNN-Augmented Expert Assignment for Mixture-of-Experts
by: Lyu, Boxuan, et al.
Published: (2026)
ExpertFlow: Efficient Mixture-of-Experts Inference via Predictive Expert Caching and Token Scheduling
by: He, Xin, et al.
Published: (2024)
LD-MoLE: Learnable Dynamic Routing for Mixture of LoRA Experts
by: Zhuang, Yuan, et al.
Published: (2025)
QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts
by: Li, Pingzhi, et al.
Published: (2024)
Pre-Attention Expert Prediction and Prefetching for Mixture-of-Experts Large Language Models
by: Zhu, Shien, et al.
Published: (2025)
ExpertPrompting: Instructing Large Language Models to be Distinguished Experts
by: Xu, Benfeng, et al.
Published: (2023)
Multi-Head Mixture-of-Experts
by: Wu, Xun, et al.
Published: (2024)
Routing-Free Mixture-of-Experts
by: Liu, Yilun, et al.
Published: (2026)
Multilingual Routing in Mixture-of-Experts
by: Bandarkar, Lucas, et al.
Published: (2025)
MoPE: Mixture of Prompt Experts for Parameter-Efficient and Scalable Multimodal Fusion
by: Jiang, Ruixiang, et al.
Published: (2024)
SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
by: Zhuang, Haomin, et al.
Published: (2024)
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
by: Liu, Boan, et al.
Published: (2023)
Mixture-of-Experts with Intermediate CTC Supervision for Accented Speech Recognition
by: Lee, Wonjun, et al.
Published: (2026)
Group then Scale: Dynamic Mixture-of-Experts Multilingual Language Model
by: Li, Chong, et al.
Published: (2025)
On the Spatial Structure of Mixture-of-Experts in Transformers
by: Bershatsky, Daniel, et al.
Published: (2025)
MoxE: Mixture of xLSTM Experts with Entropy-Aware Routing for Efficient Language Modeling
by: Thiombiano, Abdoul Majid O., et al.
Published: (2025)
Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts
by: Lee, Rhui Dih, et al.
Published: (2024)
C3AI: Crafting and Evaluating Constitutions for Constitutional AI
by: Kyrychenko, Yara, et al.
Published: (2025)
Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)
Supervisory Prompt Training
by: Billa, Jean Ghislain, et al.
Published: (2024)
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
by: Lu, Xudong, et al.
Published: (2024)