| Main Authors: | Xu, Tianshi; Chen, Yuteng; Li, Meng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.15141 |
Similar Items
Imitation Learning from Purified Demonstrations
by: Wang, Yunke, et al.
Published: (2023)
Self-Distilled Agentic Reinforcement Learning
by: Lu, Zhengxi, et al.
Published: (2026)
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
by: Li, Guanghe, et al.
Published: (2024)
Fairness Begins with State: Purifying Latent Preferences for Hierarchical Reinforcement Learning in Interactive Recommendation
by: Lu, Yun, et al.
Published: (2026)
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
by: Xiao, Teng, et al.
Published: (2026)
Gradient Boosting Reinforcement Learning
by: Fuhrer, Benjamin, et al.
Published: (2024)
Latent Poincaré Shaping for Agentic Reinforcement Learning
by: Xia, Hanchen, et al.
Published: (2026)
TTVS: Boosting Self-Exploring Reinforcement Learning via Test-time Variational Synthesis
by: Bai, Sikai, et al.
Published: (2026)
Refine and Purify: Orthogonal Basis Optimization with Null-Space Denoising for Conditional Representation Learning
by: Wang, Jiaquan, et al.
Published: (2026)
In-Trajectory Inverse Reinforcement Learning: Learn Incrementally Before An Ongoing Trajectory Terminates
by: Liu, Shicheng, et al.
Published: (2024)
Offline Reinforcement Learning with Generative Trajectory Policies
by: Feng, Xinsong, et al.
Published: (2025)
TwinPurify: Purifying gene expression data to reveal tumor-intrinsic transcriptional programs via self-supervised learning
by: Zheng, Zhiwei, et al.
Published: (2026)
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
by: Qin, Yulei, et al.
Published: (2025)
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
by: Dai, Yang, et al.
Published: (2024)
TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents
by: Gong, Chen, et al.
Published: (2024)
Offline Trajectory Optimization for Offline Reinforcement Learning
by: Zhao, Ziqi, et al.
Published: (2024)
The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory
by: Tang, Luoxi, et al.
Published: (2026)
Generalizable Trajectory Prediction via Inverse Reinforcement Learning with Mamba-Graph Architecture
by: Li, Wenyun, et al.
Published: (2025)
Purifying Approximate Differential Privacy with Randomized Post-processing
by: Lin, Yingyu, et al.
Published: (2025)
Stable Continual Reinforcement Learning via Diffusion-based Trajectory Replay
by: Chen, Feng, et al.
Published: (2024)
Adaptive Rational Activations to Boost Deep Reinforcement Learning
by: Delfosse, Quentin, et al.
Published: (2021)
EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models
by: Tan, Zheyue, et al.
Published: (2025)
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
by: Lu, Zhengxi, et al.
Published: (2026)
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
by: Zhao, Yanxiao, et al.
Published: (2024)
AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs
by: Zheng, Haizhong, et al.
Published: (2026)
Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data
by: Jia, Zeyu, et al.
Published: (2024)
Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
by: Xu, Ran, et al.
Published: (2025)
HiGP: A high-performance Python package for Gaussian Process
by: Huang, Hua, et al.
Published: (2025)
SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training
by: He, Zhongyu, et al.
Published: (2026)
MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents
by: Xu, Yifan, et al.
Published: (2025)
Agentic Reinforcement Learning for Real-World Code Repair
by: Zhu, Siyu, et al.
Published: (2025)
Training One Model to Master Cross-Level Agentic Actions via Reinforcement Learning
by: He, Kaichen, et al.
Published: (2025)
Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models
by: Qian, Tangwen, et al.
Published: (2024)
Self-ReSET: Learning to Self-Recover from Unsafe Reasoning Trajectories
by: Zhang, Dongcheng, et al.
Published: (2026)
Know your Trajectory -- Trustworthy Reinforcement Learning deployment through Importance-Based Trajectory Analysis
by: F, Clifford, et al.
Published: (2025)
Enhancing Offline Reinforcement Learning with Curriculum Learning-Based Trajectory Valuation
by: Abolfazli, Amir, et al.
Published: (2025)
"No Matter What You Do": Purifying GNN Models via Backdoor Unlearning
by: Zhang, Jiale, et al.
Published: (2024)
Single-Trajectory Distributionally Robust Reinforcement Learning
by: Liang, Zhipeng, et al.
Published: (2023)
Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning
by: Shen, Junhao, et al.
Published: (2026)
Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
by: Eschenhagen, Runa, et al.
Published: (2025)