| Main Authors: | Fuest, Michael, Hu, Vincent Tao, Ommer, Björn |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.11234 |
Similar Items
Diffusion Models and Representation Learning: A Survey
by: Fuest, Michael, et al.
Published: (2024)
[MASK] is All You Need
by: Hu, Vincent Tao, et al.
Published: (2024)
Boosting Latent Diffusion with Flow Matching
by: Schusterbauer, Johannes, et al.
Published: (2023)
Distillation of Diffusion Features for Semantic Correspondence
by: Fundel, Frank, et al.
Published: (2024)
Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment
by: Schusterbauer, Johannes, et al.
Published: (2025)
ActAlign: Zero-Shot Fine-Grained Video Classification via Language-Guided Sequence Alignment
by: Aghdam, Amir, et al.
Published: (2025)
DepthFM: Fast Monocular Depth Estimation with Flow Matching
by: Gui, Ming, et al.
Published: (2024)
Probabilistic Precipitation Nowcasting with Rectified Flow Transformers
by: Schusterbauer, Johannes, et al.
Published: (2026)
Does VLM Classification Benefit from LLM Description Semantics?
by: Ma, Pingchuan, et al.
Published: (2024)
CAGE: Unsupervised Visual Composition and Animation for Controllable Video Generation
by: Davtyan, Aram, et al.
Published: (2024)
Learning Long-term Motion Embeddings for Efficient Kinematics Generation
by: Stracke, Nick, et al.
Published: (2026)
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training
by: Krause, Felix, et al.
Published: (2025)
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives
by: Ji, Sihui, et al.
Published: (2025)
Purrception: Variational Flow Matching for Vector-Quantized Image Generation
by: Matişan, Răzvan-Andrei, et al.
Published: (2025)
Unsupervised View-Invariant Human Posture Representation
by: Sardari, Faegheh, et al.
Published: (2021)
Guiding Token-Sparse Diffusion Models
by: Krause, Felix, et al.
Published: (2026)
Scaling Image Tokenizers with Grouped Spherical Quantization
by: Wang, Jiangtao, et al.
Published: (2024)
Deforming Videos to Masks: Flow Matching for Referring Video Segmentation
by: Wang, Zanyi, et al.
Published: (2025)
SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models
by: Ma, Pingchuan, et al.
Published: (2025)
Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation
by: Wang, Jiajun, et al.
Published: (2024)
Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
by: Gui, Ming, et al.
Published: (2025)
FMVP: Masked Flow Matching for Adversarial Video Purification
by: Tang, Duoxun, et al.
Published: (2026)
LumosFlow: Motion-Guided Long Video Generation
by: Chen, Jiahao, et al.
Published: (2025)
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
by: Baumann, Stefan Andreas, et al.
Published: (2024)
What If: Understanding Motion Through Sparse Interactions
by: Baumann, Stefan Andreas, et al.
Published: (2025)
Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
by: Schusterbauer, Johannes, et al.
Published: (2026)
RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video
by: Prestel, Ulrich, et al.
Published: (2026)
CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models
by: Stracke, Nick, et al.
Published: (2024)
WaterFlow: Explicit Physics-Prior Rectified Flow for Underwater Saliency Mask Generation
by: Li, Runting, et al.
Published: (2025)
CacheFlow: Compressive Streaming Memory for Efficient Long-Form Video Understanding
by: Patel, Shrenik, et al.
Published: (2025)
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
by: Zhang, Shilong, et al.
Published: (2025)
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
by: Hu, Vincent Tao, et al.
Published: (2024)
Restora-Flow: Mask-Guided Image Restoration with Flow Matching
by: Hadzic, Arnela, et al.
Published: (2025)
FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching
by: Park, Jangho, et al.
Published: (2026)
FlowCut: Unsupervised Video Instance Segmentation via Temporal Mask Matching
by: Sari, Alp Eren, et al.
Published: (2025)
Efficient Continuous Video Flow Model for Video Prediction
by: Shrivastava, Gaurav, et al.
Published: (2024)
CleanDIFT: Diffusion Features without Noise
by: Stracke, Nick, et al.
Published: (2024)
Pyramidal Flow Matching for Efficient Video Generative Modeling
by: Jin, Yang, et al.
Published: (2024)
AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection
by: Zhang, Shuheng, et al.
Published: (2025)
Compositional Video Generation as Flow Equalization
by: Yang, Xingyi, et al.
Published: (2024)