| Main Authors: | Wu, Jing, Lai, Zhixin, Chen, Suiyao, Tao, Ran, Zhao, Pan, Hovakimyan, Naira |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.19839 |
Similar Items
Residual-based Language Models are Free Boosters for Biomedical Imaging
by: Lai, Zhixin, et al.
Published: (2024)
CROPS: A Deployable Crop Management System Over All Possible State Availabilities
by: Wu, Jing, et al.
Published: (2024)
Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection
by: Lai, Zhixin, et al.
Published: (2024)
Towards a Robust Retrieval-Based Summarization System
by: Liu, Shengjie, et al.
Published: (2024)
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
by: Wang, Zihan, et al.
Published: (2024)
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models
by: Pan, Bowen, et al.
Published: (2024)
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
by: Lu, Xudong, et al.
Published: (2024)
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models
by: Liu, Jing, et al.
Published: (2024)
\$OneMillion-Bench: How Far are Language Agents from Human Experts?
by: Yang, Qianyu, et al.
Published: (2026)
Mixture of Heterogeneous Grouped Experts for Language Modeling
by: Ma, Zhicheng, et al.
Published: (2026)
Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models
by: Kim, Gyeongman, et al.
Published: (2025)
Autonomy-of-Experts Models
by: Lv, Ang, et al.
Published: (2025)
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
by: Zhao, Zhongyu, et al.
Published: (2024)
Scaling Embeddings Outperforms Scaling Experts in Language Models
by: Liu, Hong, et al.
Published: (2026)
Pruning and Distilling Mixture-of-Experts into Dense Language Models
by: Kim, Junhyuck, et al.
Published: (2026)
OLMoE: Open Mixture-of-Experts Language Models
by: Muennighoff, Niklas, et al.
Published: (2024)
Trustworthy Summarization via Uncertainty Quantification and Risk Awareness in Large Language Models
by: Pan, Shuaidong, et al.
Published: (2025)
Harnessing Transfer Learning from Swahili: Advancing Solutions for Comorian Dialects
by: Mohamed, Naira Abdou, et al.
Published: (2024)
In-context Autoencoder for Context Compression in a Large Language Model
by: Ge, Tao, et al.
Published: (2023)
Deep Representation Learning for Multi-functional Degradation Modeling of Community-dwelling Aging Population
by: Chen, Suiyao, et al.
Published: (2024)
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
by: Nakamura, Taishi, et al.
Published: (2025)
MouSi: Poly-Visual-Expert Vision-Language Models
by: Fan, Xiaoran, et al.
Published: (2024)
Systematic Outliers in Large Language Models
by: An, Yongqi, et al.
Published: (2025)
Rethinking Data Mixing from the Perspective of Large Language Models
by: Xu, Yuanjian, et al.
Published: (2026)
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
by: Liu, Zechun, et al.
Published: (2024)
Stochastic Parrots or ICU Experts? Large Language Models in Critical Care Medicine: A Scoping Review
by: Shi, Tongyue, et al.
Published: (2024)
SambaLingo: Teaching Large Language Models New Languages
by: Csaki, Zoltan, et al.
Published: (2024)
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors
by: Huang, Jing, et al.
Published: (2025)
ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis
by: Wang, Xu, et al.
Published: (2025)
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
by: Liu, Akide, et al.
Published: (2024)
MedVAL: Toward Expert-Level Medical Text Validation with Language Models
by: Aali, Asad, et al.
Published: (2025)
Multi-Objective Large Language Model Unlearning
by: Pan, Zibin, et al.
Published: (2024)
When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
by: Xu, Haoming, et al.
Published: (2026)
FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
by: Jiang, Juyong, et al.
Published: (2026)
Composition of Experts: A Modular Compound AI System Leveraging Large Language Models
by: Jain, Swayambhoo, et al.
Published: (2024)
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
by: Nguyen, Nam V., et al.
Published: (2024)
Bregman Centroid Guided Cross-Entropy Method
by: Gu, Yuliang, et al.
Published: (2025)
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
by: Liu, Xiaoze, et al.
Published: (2024)
The Expert Strikes Back: Interpreting Mixture-of-Experts Language Models at Expert Level
by: Herbst, Jeremy, et al.
Published: (2026)