:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Hao; Jia, Shuning; Li, Guanghao; Jiang, Wenhao; Yuan, Chun
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.22703

Similar Items

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use
by: Wu, Mingyuan, et al.
Published: (2025)

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
by: Wang, Xiyao, et al.
Published: (2025)

Code as Reward: Empowering Reinforcement Learning with VLMs
by: Venuto, David, et al.
Published: (2024)

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
by: Liang, Zhenwen, et al.
Published: (2025)

Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
by: Ghosh, Udita, et al.
Published: (2025)

Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
by: Shi, Chengshuai, et al.
Published: (2026)

Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
by: Deng, Wenhao, et al.
Published: (2025)

Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning
by: Chen, Sze-Ann, et al.
Published: (2026)

Human-Corrected Labels Learning: Enhancing Labels Quality via Human Correction of VLMs Discrepancies
by: Li, Zhongnian, et al.
Published: (2025)

OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2024)

Utilizing Deep Learning for Enhancing Network Resilience in Finance
by: Gong, Yulu, et al.
Published: (2024)

Concept Drift Guided LayerNorm Tuning for Efficient Multimodal Metaphor Identification
by: Qian, Wenhao, et al.
Published: (2025)

Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering
by: Zhang, Nonghai, et al.
Published: (2026)

Offline Multi-agent Reinforcement Learning via Sequential Score Decomposition
by: Qiao, Dan, et al.
Published: (2025)

Toward Inherently Robust VLMs Against Visual Perception Attacks
by: MohajerAnsari, Pedram, et al.
Published: (2025)

Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks
by: Lian, Shijie, et al.
Published: (2025)

Enhancing Reinforcement Learning Agents with Local Guides
by: Daoudi, Paul, et al.
Published: (2024)

Distributed Multi-Head Learning Systems for Power Consumption Prediction
by: Syu, Jia-Hao, et al.
Published: (2025)

Understanding and Rectifying Safety Perception Distortion in VLMs
by: Zou, Xiaohan, et al.
Published: (2025)

TopoPerception: A Shortcut-Free Evaluation of Global Visual Perception in Large Vision-Language Models
by: Zhou, Wenhao, et al.
Published: (2025)

Learning First Integrals via Backward-Generated Data and Guided Reinforcement Learning
by: Zhong, Jingfeng, et al.
Published: (2026)

Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)

Real-World Reinforcement Learning of Active Perception Behaviors
by: Hu, Edward S., et al.
Published: (2025)

Density-Aware Translation of Spurious Correlations in Zero-Shot VLMs
by: Hasanebrahimi, Afsaneh, et al.
Published: (2026)

Deep Reinforcement Learning-Based User Scheduling for Collaborative Perception
by: Liu, Yandi, et al.
Published: (2025)

Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers
by: Shang, Shuning, et al.
Published: (2024)

Apple: Toward General Active Perception via Reinforcement Learning
by: Schneider, Tim, et al.
Published: (2025)

Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
by: Chao, Chen-Hao, et al.
Published: (2024)

MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation
by: Li, Guanghao, et al.
Published: (2025)

Thinking with Deltas: Incentivizing Reinforcement Learning via Differential Visual Reasoning Policy
by: Gao, Shujian, et al.
Published: (2026)

Solving Continual Offline Reinforcement Learning with Decision Transformer
by: Huang, Kaixin, et al.
Published: (2024)

MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint
by: Zhou, Xinglin, et al.
Published: (2024)

How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?
by: Cheng, Xiaoyuan, et al.
Published: (2026)

Heterogeneous Federated Learning System for Sparse Healthcare Time-Series Prediction
by: Syu, Jia-Hao, et al.
Published: (2025)

Structured Document Translation via Format Reinforcement Learning
by: Song, Haiyue, et al.
Published: (2025)

Geometric Mixture-of-Experts with Curvature-Guided Adaptive Routing for Graph Representation Learning
by: Cao, Haifang, et al.
Published: (2026)

Geometric Manifold Rectification for Imbalanced Learning
by: Wang, Xubin, et al.
Published: (2026)

Labeled TrustSet Guided: Batch Active Learning with Reinforcement Learning
by: Cui, Guofeng, et al.
Published: (2026)

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
by: Yang, Yiqin, et al.
Published: (2025)

Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning
by: Wu, Jiayun, et al.
Published: (2025)