| Main Authors: | Qiu, Haiyun; Wu, Xingyu; Feng, Liang; Tan, Kay Chen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06552 |
Similar Items
Towards Adaptive Continual Model Merging via Manifold-Aware Expert Evolution
by: Qiu, Haiyun, et al.
Published: (2026)
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
by: Zhou, Yu, et al.
Published: (2024)
Design Principle Transfer in Neural Architecture Search via Large Language Models
by: Zhou, Xun, et al.
Published: (2024)
Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking
by: Zhang, Dengming, et al.
Published: (2025)
FRISM: Fine-Grained Reasoning Injection via Subspace-Level Model Merging for Vision-Language Models
by: Huang, Chenyu, et al.
Published: (2026)
CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs
by: Zhou, Yu, et al.
Published: (2024)
Structural Priors and Modular Adapters in the Composable Fine-Tuning Algorithm of Large-Scale Models
by: Wang, Yuxiao, et al.
Published: (2025)
Train Separately, Merge Together: Modular Post-Training with Mixture-of-Experts
by: Morrison, Jacob, et al.
Published: (2026)
RainSeer: Fine-Grained Rainfall Reconstruction via Physics-Guided Modeling
by: Chen, Lin, et al.
Published: (2025)
Diversity-Aware Policy Optimization for Large Language Model Reasoning
by: Yao, Jian, et al.
Published: (2025)
LLM Cannot Discover Causality, and Should Be Restricted to Non-Decisional Support in Causal Discovery
by: Wu, Xingyu, et al.
Published: (2025)
Expert Merging in Sparse Mixture of Experts with Nash Bargaining
by: Nguyen, Dung V., et al.
Published: (2025)
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
by: Lu, Zhenyi, et al.
Published: (2024)
CAMEx: Curvature-aware Merging of Experts
by: Nguyen, Dung V., et al.
Published: (2025)
MergeMoE: Efficient Compression of MoE Models via Expert Output Merging
by: Miao, Ruijie, et al.
Published: (2025)
Large Language Model-Enhanced Algorithm Selection: Towards Comprehensive Algorithm Representation
by: Wu, Xingyu, et al.
Published: (2023)
How Multimodal Integration Boost the Performance of LLM for Optimization: Case Study on Capacitated Vehicle Routing Problems
by: Huang, Yuxiao, et al.
Published: (2024)
Vanishing Feature: Diagnosing Model Merging and Beyond
by: Qu, Xingyu, et al.
Published: (2024)
Certain Head, Uncertain Tail: Expert-Sample for Test-Time Scaling in Fine-Grained MoE
by: Chen, Yuanteng, et al.
Published: (2026)
MIN-Merging: Merge the Important Neurons for Model Merging
by: Liang, Yunfei
Published: (2025)
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
by: Park, Sejik
Published: (2024)
Modular Diffusion Policy Training: Decoupling and Recombining Guidance and Diffusion for Offline RL
by: Chen, Zhaoyang, et al.
Published: (2025)
FedMerge: Federated Personalization via Model Merging
by: Chen, Shutong, et al.
Published: (2025)
Why Do More Experts Fail? A Theoretical Analysis of Model Merging
by: Wang, Zijing, et al.
Published: (2025)
Soft Merging of Experts with Adaptive Routing
by: Muqeeth, Mohammed, et al.
Published: (2023)
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
by: Li, Lujun, et al.
Published: (2025)
CRAFT: Fine-Grained Cost-Aware Expert Replication For Efficient Mixture-of-Experts Serving
by: Zhao, Adrian, et al.
Published: (2026)
MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
by: Qiu, Zihuan, et al.
Published: (2025)
Channel Merging: Preserving Specialization for Merged Experts
by: Zhang, Mingyang, et al.
Published: (2024)
Unlock the Power of Algorithm Features: A Generalization Analysis for Algorithm Selection
by: Wu, Xingyu, et al.
Published: (2024)
Scaling Laws for Fine-Grained Mixture of Experts
by: Krajewski, Jakub, et al.
Published: (2024)
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
by: Tang, Anke, et al.
Published: (2024)
Superpose Task-specific Features for Model Merging
by: Qiu, Haiquan, et al.
Published: (2025)
CoMoE: Contrastive Representation for Mixture-of-Experts in Parameter-Efficient Fine-tuning
by: Feng, Jinyuan, et al.
Published: (2025)
Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging
by: Bertolissi, Ryo, et al.
Published: (2025)
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
by: Hui, Tingfeng, et al.
Published: (2024)
PuzzleMoE: Efficient Compression of Large Mixture-of-Experts Models via Sparse Expert Merging and Bit-packed inference
by: Zhao, Yushu, et al.
Published: (2025)
Access Sets Matter: Budgeting Expert Reads for Scalable Weight-Space Model Merging
by: Wang, Yuanyi, et al.
Published: (2026)
Composition of Experts: A Modular Compound AI System Leveraging Large Language Models
by: Jain, Swayambhoo, et al.
Published: (2024)
Can Muon Fine-tune Adam-Pretrained Models?
by: Qu, Xingyu, et al.
Published: (2026)