| Main Authors: | Ma, Yeyao; Li, Chen; Zhang, Xiaosong; Hu, Han; Xie, Weidi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.12155 |
Similar Items
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again
by: Geng, Zigang, et al.
Published: (2025)
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model
by: Zhang, Zheng, et al.
Published: (2024)
MatchTime: Towards Automatic Soccer Game Commentary Generation
by: Rao, Jiayuan, et al.
Published: (2024)
Moving Object Segmentation: All You Need Is SAM (and Flow)
by: Xie, Junyu, et al.
Published: (2024)
MedFlowSeg: Flow Matching for Medical Image Segmentation with Frequency-Aware Attention
by: Chen, Zhi, et al.
Published: (2026)
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
by: Meng, Yanxu, et al.
Published: (2025)
MT-EditFlow: Reinforcement Learning for Multi-Turn Image Editing with Flow Matching
by: Huang, Jiahui, et al.
Published: (2026)
Beyond Imitation: Constraint-Aware Trajectory Generation with Flow Matching For End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)
ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval
by: Zhan, Guanqi, et al.
Published: (2025)
GMOS: Grounding Moving Object Segmentation in 3D Space and Time
by: Xie, Junyu, et al.
Published: (2026)
FMVP: Masked Flow Matching for Adversarial Video Purification
by: Tang, Duoxun, et al.
Published: (2026)
Revisiting Multi-Task Visual Representation Learning
by: Di, Shangzhe, et al.
Published: (2026)
Grounded Question-Answering in Long Egocentric Videos
by: Di, Shangzhe, et al.
Published: (2023)
EchoSight: Advancing Visual-Language Models with Wiki Knowledge
by: Yan, Yibin, et al.
Published: (2024)
A Sanity Check on Composed Image Retrieval
by: Liu, Yikun, et al.
Published: (2026)
Multi-Sentence Grounding for Long-term Instructional Video
by: Li, Zeqian, et al.
Published: (2023)
Aligning Latent Geometry for Spherical Flow Matching in Image Generation
by: Meral, Tuna Han Salih, et al.
Published: (2026)
Zero-shot Composed Text-Image Retrieval
by: Liu, Yikun, et al.
Published: (2023)
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
by: Chen, Qirui, et al.
Published: (2024)
FlowLUT: Efficient Image Enhancement via Differentiable LUTs and Iterative Flow Matching
by: Hu, Liubing, et al.
Published: (2025)
FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching
by: Yi, Junchao, et al.
Published: (2026)
A Sanity Check for AI-generated Image Detection
by: Yan, Shilin, et al.
Published: (2024)
FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation
by: Lin, Mingfeng, et al.
Published: (2026)
DenseStep2M: A Scalable, Training-Free Pipeline for Dense Instructional Video Annotation
by: Ge, Mingji, et al.
Published: (2026)
Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation
by: Chen, Hao, et al.
Published: (2025)
WaterMamba: Visual State Space Model for Underwater Image Enhancement
by: Guan, Meisheng, et al.
Published: (2024)
Appearance-Based Refinement for Object-Centric Motion Segmentation
by: Xie, Junyu, et al.
Published: (2023)
Kernel Adversarial Learning for Real-world Image Super-resolution
by: Wang, Hu, et al.
Published: (2021)
Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis
by: Lu, Yanzuo, et al.
Published: (2025)
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
by: Ren, Sucheng, et al.
Published: (2024)
CurveFlow: Curvature-Guided Flow Matching for Image Generation
by: Luo, Yan, et al.
Published: (2025)
Character-Centric Understanding of Animated Movies
by: Gui, Zhongrui, et al.
Published: (2025)
Aerial Monocular 3D Object Detection
by: Hu, Yue, et al.
Published: (2022)
Flow of Truth: Proactive Temporal Forensics for Image-to-Video Generation
by: Chen, Yuzhuo, et al.
Published: (2026)
Frequency-Aware Flow Matching for High-Quality Image Generation
by: Ren, Sucheng, et al.
Published: (2026)
A General Protocol to Probe Large Vision Models for 3D Physical Understanding
by: Zhan, Guanqi, et al.
Published: (2023)
Few-Shot Distribution-Aligned Flow Matching for Data Synthesis in Medical Image Segmentation
by: Yang, Jie, et al.
Published: (2026)
Learning Straight Flows: Variational Flow Matching for Efficient Generation
by: Ma, Chenrui, et al.
Published: (2025)
Fine-grained Spatiotemporal Grounding on Egocentric Videos
by: Liang, Shuo, et al.
Published: (2025)
Can We Build Scene Graphs, Not Classify Them? FlowSG: Progressive Image-Conditioned Scene Graph Generation with Flow Matching
by: Hu, Xin, et al.
Published: (2026)