| Main Authors: | Zhao, Yanxiao, Qian, Yangge, Shan, Jingyang, Qin, Xiaolin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.11896 |
Similar Items
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
by: Zhao, Yanxiao, et al.
Published: (2024)
Neuro-symbolic Action Masking for Deep Reinforcement Learning
by: Han, Shuai, et al.
Published: (2026)
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
by: Zhao, Yanxiao, et al.
Published: (2025)
Counterfactual Explanations for Continuous Action Reinforcement Learning
by: Dong, Shuyang, et al.
Published: (2025)
Discretizing Continuous Action Space with Unimodal Probability Distributions for On-Policy Reinforcement Learning
by: Zhu, Yuanyang, et al.
Published: (2024)
On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
by: Tiwari, Saket, et al.
Published: (2022)
CRL-VLA: Continual Vision-Language-Action Learning
by: Zeng, Qixin, et al.
Published: (2026)
Action-Adaptive Continual Learning: Enabling Policy Generalization under Dynamic Action Spaces
by: Pan, Chaofan, et al.
Published: (2025)
Heuristic Algorithm-based Action Masking Reinforcement Learning (HAAM-RL) with Ensemble Inference Method
by: Choi, Kyuwon, et al.
Published: (2024)
Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
by: Tiwari, Saket, et al.
Published: (2025)
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
by: Zhao, Hanyang, et al.
Published: (2025)
CAMEL-CLIP: Channel-aware Multimodal Electroencephalography-text Alignment for Generalizable Brain Foundation Models
by: Choi, Hanseul, et al.
Published: (2026)
Provably Efficient Action-Manipulation Attack Against Continuous Reinforcement Learning
by: Luo, Zhi, et al.
Published: (2024)
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents
by: Zhao, Jiahao, et al.
Published: (2026)
Action Mapping for Reinforcement Learning in Continuous Environments with Constraints
by: Theile, Mirco, et al.
Published: (2024)
Large Continual Instruction Assistant
by: Qiao, Jingyang, et al.
Published: (2024)
On Predictability of Reinforcement Learning Dynamics for Large Language Models
by: Cai, Yuchen, et al.
Published: (2025)
Discovering Reinforcement Learning Interfaces with Large Language Models
by: Jaswal, Akshat Singh, et al.
Published: (2026)
Reinforcement Learning with Promising Tokens for Large Language Models
by: Pang, Jing-Cheng, et al.
Published: (2026)
Integration of Large Language Models and Federated Learning
by: Chen, Chaochao, et al.
Published: (2023)
Inverse Reinforcement Learning with Switching Rewards and History Dependency for Characterizing Animal Behaviors
by: Ke, Jingyang, et al.
Published: (2025)
Continual Learning of Large Language Models: A Comprehensive Survey
by: Shi, Haizhou, et al.
Published: (2024)
Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning
by: Ma, Hao, et al.
Published: (2025)
STABLE: Gated Continual Learning for Large Language Models
by: Hoy, William, et al.
Published: (2025)
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models
by: Huang, Shiting, et al.
Published: (2026)
An Advantage-based Optimization Method for Reinforcement Learning in Large Action Space
by: Lin, Hai, et al.
Published: (2024)
Trust Region Masking for Long-Horizon LLM Reinforcement Learning
by: Li, Yingru, et al.
Published: (2025)
Model-based Reinforcement Learning for Parameterized Action Spaces
by: Zhang, Renhao, et al.
Published: (2024)
Enhancing Reinforcement Learning Fine-Tuning with an Online Refiner
by: Ma, Hao, et al.
Published: (2026)
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
by: Tu, Songjun, et al.
Published: (2024)
Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
by: Sun, Yan, et al.
Published: (2025)
Offline Regularised Reinforcement Learning for Large Language Models Alignment
by: Richemond, Pierre Harvey, et al.
Published: (2024)
Generalizable Multimodal Large Language Model Editing via Invariant Trajectory Learning
by: Su, Jiajie, et al.
Published: (2026)
Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning
by: Liu, Huihan, et al.
Published: (2026)
RLAX: Large-Scale, Distributed Reinforcement Learning for Large Language Models on TPUs
by: Zhou, Runlong, et al.
Published: (2025)
Test-driven Reinforcement Learning in Continuous Control
by: Yu, Zhao, et al.
Published: (2025)
Routing-Based Continual Learning for Multimodal Large Language Models
by: Mohta, Jay, et al.
Published: (2025)
Diffusion Guided Adversarial State Perturbations in Reinforcement Learning
by: Sun, Xiaolin, et al.
Published: (2025)
Breaking the Grid: Distance-Guided Reinforcement Learning in Large Discrete Action Spaces
by: Hoppe, Heiko, et al.
Published: (2026)
Mitigating Overthinking in Large Reasoning Models via Difficulty-aware Reinforcement Learning
by: Wan, Qian, et al.
Published: (2026)