| Main Authors: | Bai, Sikai, Li, Haoxi, Zhang, Jie, Liu, Yongjiang, Guo, Song |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.08468 |
Similar Items
CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs
by: Li, Haoxi, et al.
Published: (2025)
Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models
by: Liu, Yongjiang, et al.
Published: (2025)
Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators
by: Bai, Sikai, et al.
Published: (2023)
SelfBC: Self Behavior Cloning for Offline Reinforcement Learning
by: Liu, Shirong, et al.
Published: (2024)
DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning
by: Bai, Sikai, et al.
Published: (2024)
Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards
by: Fang, Linghan, et al.
Published: (2026)
Black-box Gradient Attack on Graph Neural Networks: Deeper Insights in Graph-based Attack and Defense
by: Zhan, Haoxi, et al.
Published: (2021)
HAIM-DRL: Enhanced Human-in-the-loop Reinforcement Learning for Safe and Efficient Autonomous Driving
by: Huang, Zilin, et al.
Published: (2024)
What You Think is What You See: Driving Exploration in VLM Agents via Visual-Linguistic Curiosity
by: Li, Haoxi, et al.
Published: (2026)
Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving
by: Huang, Zilin, et al.
Published: (2024)
Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning
by: Wang, Ru, et al.
Published: (2025)
Boosting Maximum Entropy Reinforcement Learning via One-Step Flow Matching
by: Li, Zeqiao, et al.
Published: (2026)
Self-Reinforcing Controllable Synthesis of Rare Relational Data via Bayesian Calibration
by: Zhang, Chongsheng, et al.
Published: (2026)
Easy Samples Are All You Need: Self-Evolving LLMs via Data-Efficient Reinforcement Learning
by: Yu, Zhiyin, et al.
Published: (2026)
Gradient Boosting Reinforcement Learning
by: Fuhrer, Benjamin, et al.
Published: (2024)
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
by: Yang, Rui, et al.
Published: (2024)
Learning to Inject: Automated Prompt Injection via Reinforcement Learning
by: Chen, Xin, et al.
Published: (2026)
VERIRL: Boosting the LLM-based Verilog Code Generation via Reinforcement Learning
by: Teng, Fu, et al.
Published: (2025)
Self Paced Gaussian Contextual Reinforcement Learning
by: Ardakani, Mohsen Sahraei, et al.
Published: (2026)
Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning
by: Wu, Qingyuan, et al.
Published: (2025)
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning
by: Gu, Yuxuan, et al.
Published: (2025)
Reinforcement Learning via Self-Distillation
by: Hübotter, Jonas, et al.
Published: (2026)
Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback
by: Yang, Yutao, et al.
Published: (2025)
ECHO: Entropy-Confidence Hybrid Optimization for Test-Time Reinforcement Learning
by: Zhao, Chu, et al.
Published: (2026)
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
by: Li, Guanghe, et al.
Published: (2024)
Boosting long-term forecasting performance for continuous-time dynamic graph networks via data augmentation
by: Tian, Yuxing, et al.
Published: (2023)
Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control
by: Sheng, Zihao, et al.
Published: (2024)
Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance
by: McClellan, Joshua, et al.
Published: (2024)
Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop Scheduling
by: Zhang, Cong, et al.
Published: (2022)
Meta-Cognitive Reinforcement Learning with Self-Doubt and Recovery
by: Zhang, Zhipeng, et al.
Published: (2026)
A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges
by: Qing, Yunpeng, et al.
Published: (2022)
Test-Time Meta-Adaptation with Self-Synthesis
by: Kaya, Zeyneb N., et al.
Published: (2026)
Learning to Explore with Parameter-Space Noise: A Deep Dive into Parameter-Space Noise for Reinforcement Learning with Verifiable Rewards
by: Bai, Bizhe, et al.
Published: (2026)
RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$
by: Bhatia, Abhinav, et al.
Published: (2023)
Online Boosting Adaptive Learning under Concept Drift for Multistream Classification
by: Yu, En, et al.
Published: (2023)
Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search
by: Sokota, Samuel, et al.
Published: (2025)
Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning
by: Han, Seungyub, et al.
Published: (2026)
MathMixup: Boosting LLM Mathematical Reasoning with Difficulty-Controllable Data Synthesis and Curriculum Learning
by: Li, Xuchen, et al.
Published: (2026)
KAN v.s. MLP for Offline Reinforcement Learning
by: Guo, Haihong, et al.
Published: (2024)
TraPO: A Semi-Supervised Reinforcement Learning Framework for Boosting LLM Reasoning
by: Yang, Shenzhi, et al.
Published: (2025)