:: Library Catalog

Bibliographic Details
Main Authors:	Ma, Yeyao, Li, Chen, Zhang, Xiaosong, Hu, Han, Xie, Weidi
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.12155

Similar Items

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again
by: Geng, Zigang, et al.
Published: (2025)

PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model
by: Zhang, Zheng, et al.
Published: (2024)

MatchTime: Towards Automatic Soccer Game Commentary Generation
by: Rao, Jiayuan, et al.
Published: (2024)

Moving Object Segmentation: All You Need Is SAM (and Flow)
by: Xie, Junyu, et al.
Published: (2024)

MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention
by: Chen, Zhi, et al.
Published: (2026)

SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
by: Meng, Yanxu, et al.
Published: (2025)

MT-EditFlow: Reinforcement Learning for Multi-Turn Image Editing with Flow Matching
by: Huang, Jiahui, et al.
Published: (2026)

Beyond Imitation: Constraint-Aware Trajectory Generation with Flow Matching For End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)

ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
by: Zhan, Guanqi, et al.
Published: (2025)

GMOS: Grounding Moving Object Segmentation in 3D Space and Time
by: Xie, Junyu, et al.
Published: (2026)

FMVP: Masked Flow Matching for Adversarial Video Purification
by: Tang, Duoxun, et al.
Published: (2026)

Revisiting Multi-Task Visual Representation Learning
by: Di, Shangzhe, et al.
Published: (2026)

Grounded Question-Answering in Long Egocentric Videos
by: Di, Shangzhe, et al.
Published: (2023)

EchoSight: Advancing Visual-Language Models with Wiki Knowledge
by: Yan, Yibin, et al.
Published: (2024)

A Sanity Check on Composed Image Retrieval
by: Liu, Yikun, et al.
Published: (2026)

Multi-Sentence Grounding for Long-term Instructional Video
by: Li, Zeqian, et al.
Published: (2023)

Aligning Latent Geometry for Spherical Flow Matching in Image Generation
by: Meral, Tuna Han Salih, et al.
Published: (2026)

Zero-shot Composed Text-Image Retrieval
by: Liu, Yikun, et al.
Published: (2023)

Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
by: Chen, Qirui, et al.
Published: (2024)

FlowLUT: Efficient Image Enhancement via Differentiable LUTs and Iterative Flow Matching
by: Hu, Liubing, et al.
Published: (2025)

FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching
by: Yi, Junchao, et al.
Published: (2026)

A Sanity Check for AI-generated Image Detection
by: Yan, Shilin, et al.
Published: (2024)

FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation
by: Lin, Mingfeng, et al.
Published: (2026)

DenseStep2M: A Scalable, Training-Free Pipeline for Dense Instructional Video Annotation
by: Ge, Mingji, et al.
Published: (2026)

Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation
by: Chen, Hao, et al.
Published: (2025)

WaterMamba: Visual State Space Model for Underwater Image Enhancement
by: Guan, Meisheng, et al.
Published: (2024)

Appearance-Based Refinement for Object-Centric Motion Segmentation
by: Xie, Junyu, et al.
Published: (2023)

Kernel Adversarial Learning for Real-world Image Super-resolution
by: Wang, Hu, et al.
Published: (2021)

Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis
by: Lu, Yanzuo, et al.
Published: (2025)

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
by: Ren, Sucheng, et al.
Published: (2024)

CurveFlow: Curvature-Guided Flow Matching for Image Generation
by: Luo, Yan, et al.
Published: (2025)

Character-Centric Understanding of Animated Movies
by: Gui, Zhongrui, et al.
Published: (2025)

Aerial Monocular 3D Object Detection
by: Hu, Yue, et al.
Published: (2022)

Flow of Truth: Proactive Temporal Forensics for Image-to-Video Generation
by: Chen, Yuzhuo, et al.
Published: (2026)

Frequency-Aware Flow Matching for High-Quality Image Generation
by: Ren, Sucheng, et al.
Published: (2026)

A General Protocol to Probe Large Vision Models for 3D Physical Understanding
by: Zhan, Guanqi, et al.
Published: (2023)

Few-Shot Distribution-Aligned Flow Matching for Data Synthesis in Medical Image Segmentation
by: Yang, Jie, et al.
Published: (2026)

Learning Straight Flows: Variational Flow Matching for Efficient Generation
by: Ma, Chenrui, et al.
Published: (2025)

Fine-grained Spatiotemporal Grounding on Egocentric Videos
by: Liang, Shuo, et al.
Published: (2025)

Can We Build Scene Graphs, Not Classify Them? FlowSG: Progressive Image-Conditioned Scene Graph Generation with Flow Matching
by: Hu, Xin, et al.
Published: (2026)