| Main Authors: | Tran, Viet-Hoang, Trinh, Van Hoan, Bui, Khanh Vinh, Nguyen, Tan M. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.11348 |
Similar Items
Modeling Expert Interactions in Sparse Mixture of Experts via Graph Structures
by: Nguyen-Nhat, Minh-Khoi, et al.
Published: (2025)
A Mixture of Experts Vision Transformer for High-Fidelity Surface Code Decoding
by: Nguyen, Hoang Viet, et al.
Published: (2026)
Overview of the VLSP 2023 -- ComOM Shared Task: A Data Challenge for Comparative Opinion Mining from Vietnamese Product Reviews
by: Le, Hoang-Quynh, et al.
Published: (2024)
Expert Merging in Sparse Mixture of Experts with Nash Bargaining
by: Nguyen, Dung V., et al.
Published: (2025)
YRC-Bench: A Benchmark for Learning to Coordinate with Experts
by: Danesh, Mohamad H., et al.
Published: (2025)
Monomial Matrix Group Equivariant Neural Functional Networks
by: Tran, Viet-Hoang, et al.
Published: (2024)
Selective Sinkhorn Routing for Improved Sparse Mixture of Experts
by: Nguyen, Duc Anh, et al.
Published: (2025)
Quasi-Equivariant Metanetworks
by: Tran, Viet-Hoang, et al.
Published: (2026)
A Statistical Theory of Gated Attention through the Lens of Hierarchical Mixture of Experts
by: Nguyen, Viet, et al.
Published: (2026)
A Clifford Algebraic Approach to E(n)-Equivariant High-order Graph Neural Networks
by: Tran, Viet-Hoang, et al.
Published: (2024)
Bayesian Optimization for Unknown Cost-Varying Variable Subsets with No-Regret Costs
by: Hoang, Vu Viet, et al.
Published: (2024)
Rethinking Multinomial Logistic Mixture of Experts with Sigmoid Gating Function
by: Pham, Tuan Minh, et al.
Published: (2026)
Improving Routing in Sparse Mixture of Experts with Graph of Tokens
by: Nguyen, Tam, et al.
Published: (2025)
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
by: Nguyen, Nam V., et al.
Published: (2024)
N-EIoU-YOLOv9: A Signal-Aware Bounding Box Regression Loss for Lightweight Mobile Detection of Rice Leaf Diseases
by: Duc, Dung Ta Nguyen, et al.
Published: (2026)
High Dimensional Bayesian Optimization using Lasso Variable Selection
by: Hoang, Vu Viet, et al.
Published: (2025)
Mixture of Experts Meets Prompt-Based Continual Learning
by: Le, Minh, et al.
Published: (2024)
UB-SMoE: Universally Balanced Sparse Mixture-of-Experts for Resource-adaptive Federated Fine-tuning of Foundation Models
by: Tran, Van-Tuan, et al.
Published: (2026)
Mixture-of-Personas Language Models for Population Simulation
by: Bui, Ngoc, et al.
Published: (2025)
MP-MoE: Matrix Profile-Guided Mixture of Experts for Precipitation Forecasting
by: Tran, Huyen Ngoc, et al.
Published: (2026)
Tree-Sliced Wasserstein Distance: A Geometric Perspective
by: Tran, Viet-Hoang, et al.
Published: (2024)
Equivariant Polynomial Functional Networks
by: Vo, Thieu N., et al.
Published: (2024)
Equivariant Neural Functional Networks for Transformers
by: Tran, Viet-Hoang, et al.
Published: (2024)
Spherical Tree-Sliced Wasserstein Distance
by: Tran, Viet-Hoang, et al.
Published: (2025)
Vehicle Routing Problems via Quantum Graph Attention Network Deep Reinforcement Learning
by: Giang, Le Tung, et al.
Published: (2025)
Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A
by: Plaut, Benjamin, et al.
Published: (2024)
Foundations of Artificial Intelligence Frameworks: Notion and Limits of AGI
by: Bui, Khanh Gia
Published: (2025)
Test-time Diverse Reasoning by Riemannian Activation Steering
by: Khanh, Ly Tran Ho, et al.
Published: (2025)
Landscaping Linear Mode Connectivity
by: Singh, Sidak Pal, et al.
Published: (2024)
Towards Layer-Wise Personalized Federated Learning: Adaptive Layer Disentanglement via Conflicting Gradients
by: Nguyen, Minh Duong, et al.
Published: (2024)
Revisiting LARS for Large Batch Training Generalization of Neural Networks
by: Do, Khoi, et al.
Published: (2023)
Tree-Sliced Wasserstein Distance with Nonlinear Projection
by: Tran, Thanh, et al.
Published: (2025)
Cross-Modality Controlled Molecule Generation with Diffusion Language Model
by: Zhang, Yunzhe, et al.
Published: (2025)
Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding
by: Pham, Duy-Tung, et al.
Published: (2025)
One-Prompt Strikes Back: Sparse Mixture of Experts for Prompt-based Continual Learning
by: Le, Minh, et al.
Published: (2025)
Highly Efficient and Effective LLMs with Multi-Boolean Architectures
by: Tran, Ba-Hien, et al.
Published: (2025)
On Parameter Estimation in Deviated Gaussian Mixture of Experts
by: Nguyen, Huy, et al.
Published: (2024)
Automatic Prompt Selection for Large Language Models
by: Do, Viet-Tung, et al.
Published: (2024)
Revisiting Kernel Attention with Correlated Gaussian Process Representation
by: Bui, Long Minh, et al.
Published: (2025)
Generalized Linear Mode Connectivity for Transformers
by: Theus, Alexander, et al.
Published: (2025)