| Main Authors: | Wang, Yunhao, Li, Ziting, Chen, Shuai, Liu, Tao, Song, Chao, Jiang, Junjie, Zhu, Jian, Gao, Peng, Qin, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.00690 |
Similar Items
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
by: Agnihotri, Akhil, et al.
Published: (2023)
ACPO: Counteracting Likelihood Displacement in Vision-Language Alignment with Asymmetric Constraints
by: Huang, Kaili, et al.
Published: (2026)
MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
by: Xia, Yinan, et al.
Published: (2025)
ACPO: AI-Enabled Compiler Framework
by: Ashouri, Amir H., et al.
Published: (2023)
PatchCue: Enhancing Vision-Language Model Reasoning with Patch-Based Visual Cues
by: Qi, Yukun, et al.
Published: (2026)
AdaptInfer: Adaptive Token Pruning for Vision-Language Model Inference with Dynamical Text Guidance
by: Zhang, Weichen, et al.
Published: (2025)
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
by: Yang, Yang, et al.
Published: (2026)
Prompting Large Vision-Language Models for Compositional Reasoning
by: Ossowski, Timothy, et al.
Published: (2024)
Segment-Aligned Policy Optimization for Multi-Modal Reasoning
by: Gao, Lei, et al.
Published: (2026)
PRPO: Aligning Process Reward with Outcome Reward in Policy Optimization
by: Ding, Ruiyi, et al.
Published: (2026)
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization
by: Zhu, Kangyu, et al.
Published: (2024)
StreamingVLA: Streaming Vision-Language-Action Model with Action Flow Matching and Adaptive Early Observation
by: Shi, Yiran, et al.
Published: (2026)
ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization
by: Zhan, Yang, et al.
Published: (2026)
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
by: Zhu, Fangqi, et al.
Published: (2025)
Harnessing Large Vision and Language Models in Agriculture: A Review
by: Zhu, Hongyan, et al.
Published: (2024)
Mitigating the Reasoning Tax in Vision-Language Fine-Tuning with Input-Adaptive Depth Aggregation
by: Ren, Yiming, et al.
Published: (2026)
SAPO: Step-Aligned Policy Optimization for Reasoning-Based Generative Recommendation
by: Zheng, Zaiyi, et al.
Published: (2026)
VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning
by: Wang, Chaoyang, et al.
Published: (2026)
Dynamic Rank Adaptation for Vision-Language Models
by: Wang, Jiahui, et al.
Published: (2025)
Diversity-Aware Policy Optimization for Large Language Model Reasoning
by: Yao, Jian, et al.
Published: (2025)
PathReasoner-R1: Instilling Structured Reasoning into Pathology Vision-Language Model via Knowledge-Guided Policy Optimization
by: Jiang, Songhan, et al.
Published: (2026)
Hierarchical Budget Policy Optimization for Adaptive Reasoning
by: Lyu, Shangke, et al.
Published: (2025)
Soft Adaptive Policy Optimization
by: Gao, Chang, et al.
Published: (2025)
Integrated Structural Prompt Learning for Vision-Language Models
by: Wang, Jiahui, et al.
Published: (2025)
Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning
by: Deng, Huilin, et al.
Published: (2025)
CLPO: Curriculum Learning meets Policy Optimization for LLM Reasoning
by: Zhang, Shijie, et al.
Published: (2025)
Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models
by: Li, Bin, et al.
Published: (2025)
Adaptive Simulation Experiment for LLM Policy Optimization
by: Hu, Mingjie, et al.
Published: (2026)
InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning
by: Zhang, Ji, et al.
Published: (2025)
UrbanLLM: Autonomous Urban Activity Planning and Management with Large Language Models
by: Jiang, Yue, et al.
Published: (2024)
Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization
by: Lyu, Xinyu, et al.
Published: (2024)
Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition
by: Li, Yu, et al.
Published: (2025)
RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
by: Kim, Dongyoung, et al.
Published: (2026)
GSPR: Aligning LLM Safeguards as Generalizable Safety Policy Reasoners
by: Li, Haoran, et al.
Published: (2025)
GAgent: An Adaptive Rigid-Soft Gripping Agent with Vision Language Models for Complex Lighting Environments
by: Li, Zhuowei, et al.
Published: (2024)
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models
by: Jin, Ruihan, et al.
Published: (2024)
Hybrid Latent Reasoning with Decoupled Policy Optimization
by: Cheng, Tao, et al.
Published: (2026)
Voila-A: Aligning Vision-Language Models with User's Gaze Attention
by: Yan, Kun, et al.
Published: (2023)
Aligning Requirement for Large Language Model's Code Generation
by: Tian, Zhao, et al.
Published: (2025)
Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
by: Liu, Junming, et al.
Published: (2025)