| Main Authors: | Wang, Haixin, Cui, Hejie, Zhang, Chenwei, Liu, Xin, Jin, Shuowei, Geng, Shijie, Zhang, Xinyang, Zalmout, Nasser, Shi, Zhenyu, Sun, Yizhou |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.02178 |
Similar Items
CoMem: Context Management with A Decoupled Long-Context Model
by: Zhang, Yuwei, et al.
Published: (2026)
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
by: Liu, Genglin, et al.
Published: (2025)
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning
by: Wang, Xiaoxuan, et al.
Published: (2026)
Alternating Reinforcement Learning with Contextual Rubric Rewards: Beyond the Scalarization Strategy
by: Lan, Guangchen, et al.
Published: (2026)
ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer
by: Deng, Chunyuan, et al.
Published: (2026)
HeaPA: Difficulty-Aware Heap Sampling and On-Policy Query Augmentation for LLM Reinforcement Learning
by: Wang, Weiqi, et al.
Published: (2026)
AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs
by: Zheng, Haizhong, et al.
Published: (2026)
AT$^2$PO: Agentic Turn-based Policy Optimization via Tree Search
by: Zong, Zefang, et al.
Published: (2026)
Microstructures and Accuracy of Graph Recall by Large Language Models
by: Wang, Yanbang, et al.
Published: (2024)
AgentRL: Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework
by: Zhang, Hanchen, et al.
Published: (2025)
AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning
by: Zhao, Haotian, et al.
Published: (2026)
Self-Guided Diffusion Model for Accelerating Computational Fluid Dynamics
by: Li, Ruoyan, et al.
Published: (2025)
Compute Or Load KV Cache? Why Not Both?
by: Jin, Shuowei, et al.
Published: (2024)
SegTest: Metamorphic Testing of Image Segmentation via Guided Instance‐Level Test Data Augmentation
by: Zhonghao Hou, et al.
Published: (2024)
Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction
by: Chen, Xingwu, et al.
Published: (2026)
Logic of (Common or Distributed) Knowledge
by: Shi, Chenwei
Published: (2025)
Eagle: Efficient Training-Free Router for Multi-LLM Inference
by: Zhao, Zesen, et al.
Published: (2024)
The Efficiency Frontier: Classical Shadows versus Quantum Footage
by: Ma, Shuowei, et al.
Published: (2025)
HSTFL: A Heterogeneous Federated Learning Framework for Misaligned Spatiotemporal Forecasting
by: Cai, Shuowei, et al.
Published: (2024)
Enhancing Cryo-EM Density Map Segmentation in Phenix for Improved Atomic Model Building
by: Zhang, Chenwei
Published: (2026)
Applications of deep generative models to DNA reaction kinetics and to cryogenic electron microscopy
by: Zhang, Chenwei
Published: (2026)
GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
by: Yan, Haolong, et al.
Published: (2025)
Uncertainty-Aware Data-Based Method for Fast and Reliable Shape Optimization
by: Yang, Yunjia, et al.
Published: (2026)
Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling
by: Yan, Shiqi, et al.
Published: (2026)
StepPO: Step-Aligned Policy Optimization for Agentic Reinforcement Learning
by: Wang, Daoyu, et al.
Published: (2026)
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
by: Zhou, Ruiyang, et al.
Published: (2025)
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
by: Zhang, Yao, et al.
Published: (2025)
Phylogenomics Resolves Deep Phylogenetic Uncertainties and Hybridization in Rapidly Radiated Horseshoe Bats (Rhinolophus)
by: Shanxiu Yang, et al.
Published: (2025)
Optimal qudit overlapping tomography and optimal measurement order
by: Ma, Shuowei, et al.
Published: (2026)
Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning
by: Xu, Siyuan, et al.
Published: (2026)
DocTalk: Scalable Graph-based Dialogue Synthesis for Enhancing LLM Conversational Capabilities
by: Lee, Jing Yang, et al.
Published: (2025)
Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
by: Shi, Chengshuai, et al.
Published: (2026)
CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use
by: Zhang, Zhen, et al.
Published: (2026)
Preference-Guided Reinforcement Learning for Efficient Exploration
by: Wang, Guojian, et al.
Published: (2024)
Contact-Rich and Deformable Foot Modeling for Locomotion Control of the Human Musculoskeletal System
by: Gong, Haixin, et al.
Published: (2025)
ORL-LDM: Offline Reinforcement Learning Guided Latent Diffusion Model Super-Resolution Reconstruction
by: Lyu, Shijie
Published: (2025)
OGER: A Robust Offline-Guided Exploration Reward for Hybrid Reinforcement Learning
by: Ma, Xinyu, et al.
Published: (2026)
Biomedical Visual Instruction Tuning with Clinician Preference Alignment
by: Cui, Hejie, et al.
Published: (2024)
Train a Unified Multimodal Data Quality Classifier with Synthetic Data
by: Wang, Weizhi, et al.
Published: (2025)
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
by: Qin, Yulei, et al.
Published: (2025)