:: Library Catalog

Bibliographic Details
Main Authors:	Yatim, Danah; Fridman, Rafail; Bar-Tal, Omer; Dekel, Tali
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.03621

Similar Items

Versatile Editing of Video Content, Actions, and Dynamics without Training
by: Kulikov, Vladimir, et al.
Published: (2026)

EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation
by: Ma, Yue, et al.
Published: (2026)

Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis
by: Geyer, Michal, et al.
Published: (2025)

DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video
by: Tumanyan, Narek, et al.
Published: (2024)

Lumiere: A Space-Time Diffusion Model for Video Generation
by: Bar-Tal, Omer, et al.
Published: (2024)

Match-and-Fuse: Consistent Generation from Unstructured Image Sets
by: Feingold, Kate, et al.
Published: (2025)

VideoSketcher: Video Models Prior Enable Versatile Sequential Sketch Generation
by: Ren, Hui, et al.
Published: (2026)

Still-Moving: Customized Video Generation without Customized Video Data
by: Chefer, Hila, et al.
Published: (2024)

AutoVFX: Physically Realistic Video Editing from Natural Language Instructions
by: Hsu, Hao-Yu, et al.
Published: (2024)

What's in the Image? A Deep-Dive into the Vision of Vision Language Models
by: Kaduri, Omri, et al.
Published: (2024)

Real-Time Deepfake Detection in the Real-World
by: Cavia, Bar, et al.
Published: (2024)

InterDyn: Controllable Interactive Dynamics with Video Diffusion Models
by: Akkerman, Rick, et al.
Published: (2024)

Generative Omnimatte: Learning to Decompose Video into Layers
by: Lee, Yao-Chih, et al.
Published: (2024)

VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer
by: Liu, Xinyu, et al.
Published: (2025)

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space
by: Garibi, Daniel, et al.
Published: (2025)

DynFrame: Adaptive Reasoning-Driven Multimodal Framework with Dynamic Frame Augmentation for Complex Video Understanding
by: Zhang, Peng, et al.
Published: (2026)

DRoPS: Dynamic 3D Reconstruction of Pre-Scanned Objects
by: Tumanyan, Narek, et al.
Published: (2026)

DynTok: Dynamic Compression of Visual Tokens for Efficient and Effective Video Understanding
by: Zhang, Hongzhi, et al.
Published: (2025)

A texture-based framework for foundational ultrasound models
by: Grutman, Tal, et al.
Published: (2026)

PromptVFX: Text-Driven Fields for Open-World 3D Gaussian Animation
by: Kiray, Mert, et al.
Published: (2025)

Osmosis: RGBD Diffusion Prior for Underwater Image Restoration
by: Nathan, Opher Bar, et al.
Published: (2024)

AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
by: Chai, Wenhao, et al.
Published: (2024)

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning
by: Qian, Chengxuan, et al.
Published: (2025)

DynPoint: Dynamic Neural Point For View Synthesis
by: Zhou, Kaichen, et al.
Published: (2023)

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding
by: Han, Yudong, et al.
Published: (2024)

Adversarial Robustness of Discriminative Self-Supervised Learning in Vision
by: Çağatan, Ömer Veysel, et al.
Published: (2025)

DynProto: Dynamic Prototype Evolution for Out-of-Distribution Detection
by: Wu, Yanqi, et al.
Published: (2026)

DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation
by: Sun, Han, et al.
Published: (2025)

HOI-Dyn: Learning Interaction Dynamics for Human-Object Motion Diffusion
by: Wu, Lin, et al.
Published: (2025)

Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields
by: Zhang, Shangzan, et al.
Published: (2023)

DynSUP: Dynamic Gaussian Splatting from An Unposed Image Pair
by: Li, Weihang, et al.
Published: (2024)

DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting
by: Kratimenos, Agelos, et al.
Published: (2023)

DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data
by: Fu, Stephanie, et al.
Published: (2023)

ChatDyn: Language-Driven Multi-Actor Dynamics Generation in Street Scenes
by: Wei, Yuxi, et al.
Published: (2024)

DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
by: Zhao, Yuzhong, et al.
Published: (2024)

DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction
by: Seidenschwarz, Jenny, et al.
Published: (2024)

VidPanos: Generative Panoramic Videos from Casual Panning Videos
by: Ma, Jingwei, et al.
Published: (2024)

DynImg: Key Frames with Visual Prompts are Good Representation for Multi-Modal Video Understanding
by: Bao, Xiaoyi, et al.
Published: (2025)

DynVLA: Learning World Dynamics for Action Reasoning in Autonomous Driving
by: Shang, Shuyao, et al.
Published: (2026)

DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
by: Daniel, Tal, et al.
Published: (2023)