| Main Authors: | Huang, Qionghao, Lu, Lingnuo, Wu, Xuemei, Jiang, Fan, Wang, Xizhe, Wang, Xun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.13092 |
Similar Items
Mastering the Minority: An Uncertainty-guided Multi-Expert Framework for Challenging-tailed Sequence Learning
by: Wang, Ye, et al.
Published: (2026)
Multi-Head Mixture-of-Experts
by: Wu, Xun, et al.
Published: (2024)
Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts
by: Yuan, Yike, et al.
Published: (2025)
Walrus: A Cross-Domain Foundation Model for Continuum Dynamics
by: McCabe, Michael, et al.
Published: (2025)
Route Experts by Sequence, not by Token
by: Wen, Tiansheng, et al.
Published: (2025)
Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations
by: Jaiswal, Ajay, et al.
Published: (2025)
MIPS: a Multimodal Infinite Polymer Sequence Pre-training Framework for Polymer Property Prediction
by: Wang, Jiaxi, et al.
Published: (2025)
Curriculum reinforcement learning with measurable task representation learning
by: Wen, Yongyan, et al.
Published: (2026)
Accelerating Mixture-of-Expert Inference with Adaptive Expert Split Mechanism
by: Yan, Jiaming, et al.
Published: (2025)
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes
by: Bandyopadhyay, Dibyanayan, et al.
Published: (2024)
FitText: Evolving Agent Tool Ecologies via Memetic Retrieval
by: Zheng, Kyle, et al.
Published: (2026)
Pretext Training Algorithms for Event Sequence Data
by: Wang, Yimu, et al.
Published: (2024)
Artificial Intelligence and Deep Learning Algorithms for Epigenetic Sequence Analysis: A Review for Epigeneticists and AI Experts
by: Tahir, Muhammad, et al.
Published: (2025)
AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)
Machine Learning-Based Genomic Linguistic Analysis (Gene Sequence Feature Learning): A Case Study on Predicting Heavy Metal Response Genes in Rice
by: Yang, Ruiqi, et al.
Published: (2025)
Collaborative Adaptive Curriculum for Progressive Knowledge Distillation
by: Liu, Jing, et al.
Published: (2026)
AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering
by: Cai, Yuzhu, et al.
Published: (2026)
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
by: Zhao, Zirui, et al.
Published: (2024)
Sharpening the Spear: Adaptive Expert-Guided Adversarial Attack Against DRL-based Autonomous Driving Policies
by: Fan, Junchao, et al.
Published: (2025)
M$^2$FMoE: Multi-Resolution Multi-View Frequency Mixture-of-Experts for Extreme-Adaptive Time Series Forecasting
by: Huang, Yaohui, et al.
Published: (2026)
A Curriculum Learning Approach to Reinforcement Learning: Leveraging RAG for Multimodal Question Answering
by: Zhang, Chenliang, et al.
Published: (2025)
AdaCuRL: Adaptive Curriculum Reinforcement Learning with Invalid Sample Mitigation and Historical Revisiting
by: Li, Renda, et al.
Published: (2025)
Bandit Guided Submodular Curriculum for Adaptive Subset Selection
by: Chanda, Prateek, et al.
Published: (2025)
UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
by: Huang, Minbin, et al.
Published: (2026)
Gradient Descent Algorithm Survey
by: Fucheng, Deng, et al.
Published: (2025)
Geometric Mixture-of-Experts with Curvature-Guided Adaptive Routing for Graph Representation Learning
by: Cao, Haifang, et al.
Published: (2026)
Dynamic Adaptive Shared Experts with Grouped Multi-Head Attention Mixture of Experts
by: Li, Cheng, et al.
Published: (2025)
Textual Aesthetics in Large Language Models
by: Jiang, Lingjie, et al.
Published: (2024)
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
by: Mu, Siyuan, et al.
Published: (2025)
Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline
by: Fang, Zhiyuan, et al.
Published: (2025)
DBES: A Systematic Benchmark and Metric Suite for Evaluating Expert Specialization in Large-Scale MoEs
by: Wang, Jing, et al.
Published: (2026)
SADA: Stability-guided Adaptive Diffusion Acceleration
by: Jiang, Ting, et al.
Published: (2025)
PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction
by: Zhao, Yuxiang, et al.
Published: (2024)
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
by: Zhao, Hao, et al.
Published: (2024)
A Self-guided Multimodal Approach to Enhancing Graph Representation Learning for Alzheimer's Diseases
by: Wang, Zhepeng, et al.
Published: (2024)
Enhancing DP-SGD through Non-monotonous Adaptive Scaling Gradient Weight
by: Huang, Tao, et al.
Published: (2024)
Research on Personalized Medical Intervention Strategy Generation System based on Group Relative Policy Optimization and Time-Series Data Fusion
by: Lu, Dingxin, et al.
Published: (2025)
SD-MoE: Spectral Decomposition for Effective Expert Specialization
by: Huang, Ruijun, et al.
Published: (2026)
Reparameterization Proximal Policy Optimization
by: Zhong, Hai, et al.
Published: (2025)
Reparameterization Flow Policy Optimization
by: Zhong, Hai, et al.
Published: (2026)