| Main Authors: | Guan, Jiajin; Mei, Haibo; Zhang, Bonan; Liu, Dan; Fu, Yuanshuang; Zhang, Yue |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.11196 |
Similar Items
GRPO-TTA: Test-Time Visual Tuning for Vision-Language Models via GRPO-Driven Reinforcement Learning
by: Li, Yujun, et al.
Published: (2026)
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO
by: Yao, Huanjin, et al.
Published: (2025)
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision
by: Wei, Zhixiang, et al.
Published: (2026)
Safe Path Planning and Observation Quality Enhancement Strategy for Unmanned Aerial Vehicles in Water Quality Monitoring Tasks
by: Fu, Yuanshuang, et al.
Published: (2025)
VL-UniTrack: A Unified Framework with Visual-Language Prompts for UAV-Ground Visual Tracking
by: Xu, Boyue, et al.
Published: (2026)
UAV-CodeAgents: Scalable UAV Mission Planning via Multi-Agent ReAct and Vision-Language Reasoning
by: Sautenkov, Oleg, et al.
Published: (2025)
SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
by: Fu, Yuqian, et al.
Published: (2025)
GRPO-RM: Fine-Tuning Representation Models via GRPO-Driven Reinforcement Learning
by: Xu, Yanchen, et al.
Published: (2025)
Can Vision-Language Models Think from the Sky? Unifying UAV Reasoning and Generation
by: Sun, Jintao, et al.
Published: (2026)
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models
by: Tan, Huajie, et al.
Published: (2025)
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
by: Varma, Maya, et al.
Published: (2024)
UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking
by: Ren, Qionglin, et al.
Published: (2025)
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
by: Liu, Youzhi, et al.
Published: (2024)
Rethinking Fine-Tuning: Unlocking Hidden Capabilities in Vision-Language Models
by: Zhang, Mingyuan, et al.
Published: (2025)
Leveraging Large Vision Model for Multi-UAV Co-perception in Low-Altitude Wireless Networks
by: Xu, Yunting, et al.
Published: (2026)
Top-Down Compression: Revisit Efficient Vision Token Projection for Visual Instruction Tuning
by: Li, Bonan, et al.
Published: (2025)
UAV-OVO: Out-of-Viewpoint Generalization in UAV Action Recognition
by: Xia, Yu, et al.
Published: (2026)
DVGBench: Implicit-to-Explicit Visual Grounding Benchmark in UAV Imagery with Large Vision-Language Models
by: Zhou, Yue, et al.
Published: (2026)
PestVL-Net: Enabling Multimodal Pest Learning via Fine-grained Vision-Language Interaction
by: Li, Xueheng, et al.
Published: (2026)
First‐Principle Prediction of Hydrogen Diffusion Path in Titanium Alloy Passivation Film
by: Yuanshuang Liu, et al.
Published: (2025)
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning
by: Qi, Zhangyang, et al.
Published: (2025)
IndoorUAV: Benchmarking Vision-Language UAV Navigation in Continuous Indoor Environments
by: Liu, Xu, et al.
Published: (2025)
Fine-Tuning Vision-Language Models for Visual Navigation Assistance
by: Li, Xiao, et al.
Published: (2025)
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
by: Cai, Xinyan, et al.
Published: (2025)
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
by: Liu, Yuqi, et al.
Published: (2025)
Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery
by: Wasil, Mohammad, et al.
Published: (2025)
AutoFly: Vision-Language-Action Model for UAV Autonomous Navigation in the Wild
by: Sun, Xiaolou, et al.
Published: (2026)
15,500 Seconds: Lean UAV Classification Using EfficientNet and Lightweight Fine-Tuning
by: Berg, Andrew P., et al.
Published: (2025)
UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue
by: Yaqoot, Yasheerah, et al.
Published: (2025)
A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge
by: Huang, Shuangyao, et al.
Published: (2025)
CollabOD: Collaborative Multi-Backbone with Cross-scale Vision for UAV Small Object Detection
by: Bai, Xuecheng, et al.
Published: (2026)
RPO: Fine-Tuning Visual Generative Models via Rich Vision-Language Preferences
by: Zhao, Hanyang, et al.
Published: (2025)
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
by: Park, Jinyoung, et al.
Published: (2025)
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
by: Liu, Yue, et al.
Published: (2025)
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning
by: Hao, Zhiwei, et al.
Published: (2024)
AeroDuo: Aerial Duo for UAV-based Vision and Language Navigation
by: Wu, Ruipu, et al.
Published: (2025)
DanceGRPO: Unleashing GRPO on Visual Generation
by: Xue, Zeyue, et al.
Published: (2025)
Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation
by: Jia, Jidong, et al.
Published: (2026)
Multi-Task Bayesian Optimization for Tuning Decentralized Trajectory Generation in Multi-UAV Systems
by: Manzoni, Marta, et al.
Published: (2025)
VL-Nav: A Neuro-Symbolic Approach for Reasoning-based Vision-Language Navigation
by: Du, Yi, et al.
Published: (2025)