| Main Authors: | Yu, Hao, Jia, Shuning, Li, Guanghao, Jiang, Wenhao, Yuan, Chun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.22703 |
Similar Items
VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use
by: Wu, Mingyuan, et al.
Published: (2025)
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
by: Wang, Xiyao, et al.
Published: (2025)
Code as Reward: Empowering Reinforcement Learning with VLMs
by: Venuto, David, et al.
Published: (2024)
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
by: Liang, Zhenwen, et al.
Published: (2025)
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
by: Ghosh, Udita, et al.
Published: (2025)
Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
by: Shi, Chengshuai, et al.
Published: (2026)
Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
by: Deng, Wenhao, et al.
Published: (2025)
Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning
by: Chen, Sze-Ann, et al.
Published: (2026)
Human-Corrected Labels Learning: Enhancing Labels Quality via Human Correction of VLMs Discrepancies
by: Li, Zhongnian, et al.
Published: (2025)
OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2024)
Utilizing Deep Learning for Enhancing Network Resilience in Finance
by: Gong, Yulu, et al.
Published: (2024)
Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification
by: Qian, Wenhao, et al.
Published: (2025)
Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering
by: Zhang, Nonghai, et al.
Published: (2026)
Offline Multi-agent Reinforcement Learning via Sequential Score Decomposition
by: Qiao, Dan, et al.
Published: (2025)
Toward Inherently Robust VLMs Against Visual Perception Attacks
by: MohajerAnsari, Pedram, et al.
Published: (2025)
Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks
by: Lian, Shijie, et al.
Published: (2025)
Enhancing Reinforcement Learning Agents with Local Guides
by: Daoudi, Paul, et al.
Published: (2024)
Distributed Multi-Head Learning Systems for Power Consumption Prediction
by: Syu, Jia-Hao, et al.
Published: (2025)
Understanding and Rectifying Safety Perception Distortion in VLMs
by: Zou, Xiaohan, et al.
Published: (2025)
TopoPerception: A Shortcut-Free Evaluation of Global Visual Perception in Large Vision-Language Models
by: Zhou, Wenhao, et al.
Published: (2025)
Learning First Integrals via Backward-Generated Data and Guided Reinforcement Learning
by: Zhong, Jingfeng, et al.
Published: (2026)
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)
Real-World Reinforcement Learning of Active Perception Behaviors
by: Hu, Edward S., et al.
Published: (2025)
Density-Aware Translation of Spurious Correlations in Zero-Shot VLMs
by: Hasanebrahimi, Afsaneh, et al.
Published: (2026)
Deep Reinforcement Learning-Based User Scheduling for Collaborative Perception
by: Liu, Yandi, et al.
Published: (2025)
Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers
by: Shang, Shuning, et al.
Published: (2024)
Apple: Toward General Active Perception via Reinforcement Learning
by: Schneider, Tim, et al.
Published: (2025)
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
by: Chao, Chen-Hao, et al.
Published: (2024)
MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation
by: Li, Guanghao, et al.
Published: (2025)
Thinking with Deltas: Incentivizing Reinforcement Learning via Differential Visual Reasoning Policy
by: Gao, Shujian, et al.
Published: (2026)
Solving Continual Offline Reinforcement Learning with Decision Transformer
by: Huang, Kaixin, et al.
Published: (2024)
MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint
by: Zhou, Xinglin, et al.
Published: (2024)
How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?
by: Cheng, Xiaoyuan, et al.
Published: (2026)
Heterogeneous Federated Learning System for Sparse Healthcare Time-Series Prediction
by: Syu, Jia-Hao, et al.
Published: (2025)
Structured Document Translation via Format Reinforcement Learning
by: Song, Haiyue, et al.
Published: (2025)
Geometric Mixture-of-Experts with Curvature-Guided Adaptive Routing for Graph Representation Learning
by: Cao, Haifang, et al.
Published: (2026)
Geometric Manifold Rectification for Imbalanced Learning
by: Wang, Xubin, et al.
Published: (2026)
Labeled TrustSet Guided: Batch Active Learning with Reinforcement Learning
by: Cui, Guofeng, et al.
Published: (2026)
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
by: Yang, Yiqin, et al.
Published: (2025)
Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning
by: Wu, Jiayun, et al.
Published: (2025)