| Main Authors: | Chen, Liang; Liu, Qi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.02228 |
Similar Items
Position: The Turing-Completeness of Autoregressive Transformers Relies Heavily on Context Management
by: Cui, Guanyu, et al.
Published: (2026)
Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
by: Zhu, Jiachen, et al.
Published: (2026)
Efficient On-Device Agents via Adaptive Context Management
by: Vijayvargiya, Sanidhya, et al.
Published: (2025)
Offline Multi-Agent Reinforcement Learning via In-Sample Sequential Policy Optimization
by: Liu, Zongkai, et al.
Published: (2024)
Random Policy Enables In-Context Reinforcement Learning within Trust Horizons
by: Chen, Weiqin, et al.
Published: (2024)
PolicyLong: Towards On-Policy Context Extension
by: Jia, Junlong, et al.
Published: (2026)
AgentFold: Long-Horizon Web Agents with Proactive Context Management
by: Ye, Rui, et al.
Published: (2025)
Paged Attention Meets FlexAttention: Unlocking Long-Context Efficiency in Deployed Inference
by: Joshi, Thomas, et al.
Published: (2025)
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies
by: Lou, Zhanzhi, et al.
Published: (2026)
Policy and World Modeling Co-Training for Language Agents
by: Lu, Ning, et al.
Published: (2026)
DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management
by: Xie, Yaqi, et al.
Published: (2026)
Learning Long-Context Diffusion Policies via Past-Token Prediction
by: Torne, Marcel, et al.
Published: (2025)
TypeBandit: Type-Level Context Allocation and Reweighting for Effective Attribute Completion in Heterogeneous Graph Neural Networks
by: Wang, Ta-Yang, et al.
Published: (2026)
Agent-Omit: Adaptive Context Omission for Efficient LLM Agents
by: Ning, Yansong, et al.
Published: (2026)
Learning to Remember: End-to-End Training of Memory Agents for Long-Context Reasoning
by: Zhang, Kehao, et al.
Published: (2026)
Context Distillation as Latent Memory Management
by: Zheng, Ziyang, et al.
Published: (2026)
The PokeAgent Challenge: Competitive and Long-Context Learning at Scale
by: Karten, Seth, et al.
Published: (2026)
Scalable In-Context Q-Learning
by: Liu, Jinmei, et al.
Published: (2025)
Clustering Context in Off-Policy Evaluation
by: Guzman-Olivares, Daniel, et al.
Published: (2025)
MEMENTO: Teaching LLMs to Manage Their Own Context
by: Kontonis, Vasilis, et al.
Published: (2026)
Context Learning for Multi-Agent Discussion
by: Hua, Xingyuan, et al.
Published: (2026)
Metric-Gradient Projection for Stable Multi-Agent Policy Learning
by: Zhang, Zuyuan, et al.
Published: (2026)
$K$-Level Policy Gradients for Multi-Agent Reinforcement Learning
by: Reddi, Aryaman, et al.
Published: (2025)
Dual Turing Test: A Framework for Detecting and Mitigating Undetectable AI
by: Messina, Alberto
Published: (2025)
Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Unlearning Completeness
by: Wang, Cheng-Long, et al.
Published: (2025)
Group-in-Group Policy Optimization for LLM Agent Training
by: Feng, Lang, et al.
Published: (2025)
Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning
by: Mu, Xianling, et al.
Published: (2025)
Context and Diversity Matter: The Emergence of In-Context Learning in World Models
by: Wang, Fan, et al.
Published: (2025)
Descent-Guided Policy Gradient for Scalable Cooperative Multi-Agent Learning
by: Yang, Shan, et al.
Published: (2026)
Challenge on Optimization of Context Collection for Code Completion
by: Ustalov, Dmitry, et al.
Published: (2025)
Reinforcement Learning for Machine Learning Engineering Agents
by: Yang, Sherry, et al.
Published: (2025)
PageRank Bandits for Link Prediction
by: Ban, Yikun, et al.
Published: (2024)
Exact and Asymptotically Complete Robust Verifications of Neural Networks via Quantum Optimization
by: Li, Wenxin, et al.
Published: (2026)
UniFluids: Unified Neural Operator Learning with Conditional Flow-matching
by: Li, Haosen, et al.
Published: (2026)
ContextEvolve: Multi-Agent Context Compression for Systems Code Optimization
by: Su, Hongyuan, et al.
Published: (2026)
ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks
by: Ren, Zhiyao, et al.
Published: (2025)
Mixture-of-Experts Meets In-Context Reinforcement Learning
by: Wu, Wenhao, et al.
Published: (2025)
Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
by: Shindo, Hikaru, et al.
Published: (2024)