Bibliographic Details
Main Authors:	Chen, Jiankang; Zhang, Tianke; Liu, Changyi; Ding, Haojie; Shi, Yaya; Cheng, Feng; Xiao, Huihui; Wen, Bin; Yang, Fan; Gao, Tingting; Zhang, Di
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.09925

Similar Items

InstructEngine: Instruction-driven Text-to-Image Alignment
by: Lu, Xingyu, et al.
Published: (2025)

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
by: Zhang, Yi-Fan, et al.
Published: (2025)

EVLM: An Efficient Vision-Language Model for Visual Understanding
by: Chen, Kaibing, et al.
Published: (2024)

Thyme: Think Beyond Images
by: Zhang, Yi-Fan, et al.
Published: (2025)

Joint Reward Modeling: Internalizing Chain-of-Thought for Efficient Visual Reward Models
by: Yang, Yankai, et al.
Published: (2026)

VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform
by: Lu, Xingyu, et al.
Published: (2025)

Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner
by: Fan, Chenyou, et al.
Published: (2024)

From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition
by: Wang, Tianduo, et al.
Published: (2025)

SpatialReward: Bridging the Perception Gap in Online RL for Image Editing via Explicit Spatial Reasoning
by: Long, Yancheng, et al.
Published: (2026)

SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
by: Mei, Zhiyu, et al.
Published: (2023)

Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models
by: Li, Changqun, et al.
Published: (2024)

Kwai-STaR: Transform LLMs into State-Transition Reasoners
by: Lu, Xingyu, et al.
Published: (2024)

VideoTemp-o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking-with-Videos
by: Liu, Wenqi, et al.
Published: (2026)

Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning
by: Xu, Zhiyang, et al.
Published: (2024)

Speculative Coreset Selection for Task-Specific Fine-tuning
by: Zhang, Xiaoyu, et al.
Published: (2024)

Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning
by: Hu, Xiao, et al.
Published: (2025)

UniRef-Image-Edit: Towards Scalable and Consistent Multi-Reference Image Editing
by: Wei, Hongyang, et al.
Published: (2026)

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning
by: Lu, Xingyu, et al.
Published: (2026)

Learning a Thousand Tasks in a Day
by: Dreczkowski, Kamil, et al.
Published: (2025)

ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL
by: Lu, Xingyu, et al.
Published: (2026)

Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning
by: Du, Xiaohu, et al.
Published: (2024)

Search to Fine-tune Pre-trained Graph Neural Networks for Graph-level Tasks
by: Wang, Zhili, et al.
Published: (2023)

Sea turtle nesting in the Ten Thousand Islands of Florida
by: Garmestani, Ahjond S., et al.
Published: (1997)

CF-VLM: CounterFactual Vision-Language Fine-tuning
by: Zhang, Jusheng, et al.
Published: (2025)

iMOVE: Instance-Motion-Aware Video Understanding
by: Li, Jiaze, et al.
Published: (2025)

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
by: Bachmann, Roman, et al.
Published: (2024)

Ten Thousand Journal Articles Later: Ethnography of «The Literature» in Science
by: Kelty, Christopher
Published: (2009)

A canoe trip in the Ten Thousand Islands to collect Liguus
by: Mcginty, P L
Published: (1936)

Efficient Model Editing with Task-Localized Sparse Fine-tuning
by: Iurada, Leonardo, et al.
Published: (2025)

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
by: Zhang, Junyuan, et al.
Published: (2025)

LLaMA-Gene: A General-purpose Gene Task Large Language Model Based on Instruction Fine-tuning
by: Liang, Wang
Published: (2024)

Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models
by: Liu, Zikang, et al.
Published: (2025)

AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning
by: Cui, Jiaxi, et al.
Published: (2024)

Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks
by: Yildirim, Savas
Published: (2024)

UEMM-Air: Make Unmanned Aerial Vehicles Perform More Multi-modal Tasks
by: Yao, Liang, et al.
Published: (2024)

MotIF: Motion Instruction Fine-tuning
by: Hwang, Minyoung, et al.
Published: (2024)

Research Progress and Application Prospects of Transition Metal Phosphosulfide Heterojunction Catalysts in Electrocatalytic Water Splitting
by: Liu, Ningning, et al.
Published: (2025)

FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema
by: Lu, Junru, et al.
Published: (2024)

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization
by: Shen, Yang, et al.
Published: (2024)

Improving Low-Resource Knowledge Tracing Tasks by Supervised Pre-training and Importance Mechanism Fine-tuning
by: Zhang, Hengyuan, et al.
Published: (2024)