:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Tianshi; Chen, Yuteng; Li, Meng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.15141

Similar Items

Imitation Learning from Purified Demonstrations
by: Wang, Yunke, et al.
Published: (2023)

Self-Distilled Agentic Reinforcement Learning
by: Lu, Zhengxi, et al.
Published: (2026)

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
by: Li, Guanghe, et al.
Published: (2024)

Fairness Begins with State: Purifying Latent Preferences for Hierarchical Reinforcement Learning in Interactive Recommendation
by: Lu, Yun, et al.
Published: (2026)

Meta-Reinforcement Learning with Self-Reflection for Agentic Search
by: Xiao, Teng, et al.
Published: (2026)

Gradient Boosting Reinforcement Learning
by: Fuhrer, Benjamin, et al.
Published: (2024)

Latent Poincaré Shaping for Agentic Reinforcement Learning
by: Xia, Hanchen, et al.
Published: (2026)

TTVS: Boosting Self-Exploring Reinforcement Learning via Test-time Variational Synthesis
by: Bai, Sikai, et al.
Published: (2026)

Refine and Purify: Orthogonal Basis Optimization with Null-Space Denoising for Conditional Representation Learning
by: Wang, Jiaquan, et al.
Published: (2026)

In-Trajectory Inverse Reinforcement Learning: Learn Incrementally Before An Ongoing Trajectory Terminates
by: Liu, Shicheng, et al.
Published: (2024)

Offline Reinforcement Learning with Generative Trajectory Policies
by: Feng, Xinsong, et al.
Published: (2025)

TwinPurify: Purifying gene expression data to reveal tumor-intrinsic transcriptional programs via self-supervised learning
by: Zheng, Zhiwei, et al.
Published: (2026)

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning
by: Qin, Yulei, et al.
Published: (2025)

Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
by: Dai, Yang, et al.
Published: (2024)

TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents
by: Gong, Chen, et al.
Published: (2024)

Offline Trajectory Optimization for Offline Reinforcement Learning
by: Zhao, Ziqi, et al.
Published: (2024)

The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory
by: Tang, Luoxi, et al.
Published: (2026)

Generalizable Trajectory Prediction via Inverse Reinforcement Learning with Mamba-Graph Architecture
by: Li, Wenyun, et al.
Published: (2025)

Purifying Approximate Differential Privacy with Randomized Post-processing
by: Lin, Yingyu, et al.
Published: (2025)

Stable Continual Reinforcement Learning via Diffusion-based Trajectory Replay
by: Chen, Feng, et al.
Published: (2024)

Adaptive Rational Activations to Boost Deep Reinforcement Learning
by: Delfosse, Quentin, et al.
Published: (2021)

EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models
by: Tan, Zheyue, et al.
Published: (2025)

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
by: Lu, Zhengxi, et al.
Published: (2026)

Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
by: Zhao, Yanxiao, et al.
Published: (2024)

AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs
by: Zheng, Haizhong, et al.
Published: (2026)

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data
by: Jia, Zeyu, et al.
Published: (2024)

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
by: Xu, Ran, et al.
Published: (2025)

HiGP: A high-performance Python package for Gaussian Process
by: Huang, Hua, et al.
Published: (2025)

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training
by: He, Zhongyu, et al.
Published: (2026)

MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents
by: Xu, Yifan, et al.
Published: (2025)

Agentic Reinforcement Learning for Real-World Code Repair
by: Zhu, Siyu, et al.
Published: (2025)

Training One Model to Master Cross-Level Agentic Actions via Reinforcement Learning
by: He, Kaichen, et al.
Published: (2025)

Context-Enhanced Multi-View Trajectory Representation Learning: Bridging the Gap through Self-Supervised Models
by: Qian, Tangwen, et al.
Published: (2024)

Self-ReSET: Learning to Self-Recover from Unsafe Reasoning Trajectories
by: Zhang, Dongcheng, et al.
Published: (2026)

Know your Trajectory -- Trustworthy Reinforcement Learning deployment through Importance-Based Trajectory Analysis
by: F, Clifford, et al.
Published: (2025)

Enhancing Offline Reinforcement Learning with Curriculum Learning-Based Trajectory Valuation
by: Abolfazli, Amir, et al.
Published: (2025)

"No Matter What You Do": Purifying GNN Models via Backdoor Unlearning
by: Zhang, Jiale, et al.
Published: (2024)

Single-Trajectory Distributionally Robust Reinforcement Learning
by: Liang, Zhipeng, et al.
Published: (2023)

Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning
by: Shen, Junhao, et al.
Published: (2026)

Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner
by: Eschenhagen, Runa, et al.
Published: (2025)