| Main Authors: | Wen, Jianyu, Wei, Yang, Yu, Xiongxi, Xiao, Changxuan, Zeng, Ke |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.16596 |
Similar Items
MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models
by: Cao, Jie, et al.
Published: (2025)
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
by: Wang, Kuan-Chieh, et al.
Published: (2024)
Mixture-of-Depths Attention
by: Zhu, Lianghui, et al.
Published: (2026)
Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference
by: Khaled, Arindam
Published: (2026)
MoBA: Mixture of Block Attention for Long-Context LLMs
by: Lu, Enzhe, et al.
Published: (2025)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
Depth-Recurrent Attention Mixtures: Giving Latent Reasoning the Attention it Deserves
by: Knupp, Jonas, et al.
Published: (2026)
ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation
by: Wu, Qinzhuo, et al.
Published: (2025)
Yuan 2.0-M32: Mixture of Experts with Attention Router
by: Wu, Shaohua, et al.
Published: (2024)
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
by: Wei, Tianwen, et al.
Published: (2024)
HCAttention: Extreme KV Cache Compression via Heterogeneous Attention Computing for LLMs
by: Yang, Dongquan, et al.
Published: (2025)
AgentRefine: Enhancing Agent Generalization through Refinement Tuning
by: Fu, Dayuan, et al.
Published: (2025)
Multi-granularity Interactive Attention Framework for Residual Hierarchical Pronunciation Assessment
by: Han, Hong, et al.
Published: (2026)
Interpretable Emergent Language Using Inter-Agent Transformers
by: Bhardwaj, Mannan
Published: (2025)
Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding
by: Zhou, Yuhang, et al.
Published: (2025)
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
by: Li, Minghao, et al.
Published: (2025)
Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
by: Wen, Hongbo, et al.
Published: (2026)
DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning
by: Huang, Haoyu, et al.
Published: (2026)
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
by: Song, Yifan, et al.
Published: (2024)
SR-KI: Scalable and Real-Time Knowledge Integration into LLMs via Supervised Attention
by: Yu, Bohan, et al.
Published: (2025)
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
by: Yuan, Jingyang, et al.
Published: (2025)
GEM: Graph-Enhanced Mixture-of-Experts with ReAct Agents for Dialogue State Tracking
by: Zhu, Ziqi, et al.
Published: (2026)
BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
by: Wu, Qinzhuo, et al.
Published: (2025)
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
by: Gao, Yizhao, et al.
Published: (2026)
From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
by: Xu, Ke, et al.
Published: (2026)
DeepEra: A Deep Evidence Reranking Agent for Scientific Retrieval-Augmented Generated Question Answering
by: Chen, Haotian, et al.
Published: (2026)
Pre-Attention Expert Prediction and Prefetching for Mixture-of-Experts Large Language Models
by: Zhu, Shien, et al.
Published: (2025)
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias
by: Ke, Yu He, et al.
Published: (2024)
DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping
by: Fan, Wei, et al.
Published: (2025)
Explicit Multi-head Attention for Inter-head Interaction in Large Language Models
by: Peng, Runyu, et al.
Published: (2026)
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
by: Dong, Guanting, et al.
Published: (2026)
ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents
by: Shao, Jie-Jing, et al.
Published: (2024)
Attention-guided Evidence Grounding for Spoken Question Answering
by: Yang, Ke, et al.
Published: (2026)
Positional Encoding via Token-Aware Phase Attention
by: Wang, Yu, et al.
Published: (2025)
MoKA: Mixture of Kronecker Adapters
by: Sadeghi, Mohammadreza, et al.
Published: (2025)
The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement
by: Yang, Ruihan, et al.
Published: (2025)
Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
by: Light, Dean, et al.
Published: (2026)
CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis
by: Wei, Anjiang, et al.
Published: (2025)
Attention Editing: A Versatile Framework for Cross-Architecture Attention Conversion
by: Cheng, Zhen, et al.
Published: (2026)
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
by: Zhou, Ruiwen, et al.
Published: (2024)