:: Library Catalog

Bibliographic Details
Main Authors:	Bahmani, Sherwin; Liu, Xian; Wang, Yifan; Skorokhodov, Ivan; Rong, Victor; Liu, Ziwei; Liu, Xihui; Park, Jeong Joon; Tulyakov, Sergey; Wetzstein, Gordon; Tagliasacchi, Andrea; Lindell, David B.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.17920

Similar Items

4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
by: Bahmani, Sherwin, et al.
Published: (2023)

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
by: Bahmani, Sherwin, et al.
Published: (2024)

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
by: Bahmani, Sherwin, et al.
Published: (2024)

Grow with the Flow: 4D Reconstruction of Growing Plants with Gaussian Flow Fields
by: Luo, Weihan, et al.
Published: (2026)

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
by: Liu, Xian, et al.
Published: (2023)

MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars
by: Taubner, Felix, et al.
Published: (2025)

GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling
by: Rong, Victor, et al.
Published: (2024)

Hierarchical Patch Diffusion Models for High-Resolution Video Generation
by: Skorokhodov, Ivan, et al.
Published: (2024)

SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
by: Namekata, Koichi, et al.
Published: (2024)

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
by: Wu, Tong, et al.
Published: (2024)

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
by: Wang, Chaoyang, et al.
Published: (2024)

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
by: Liu, Xian, et al.
Published: (2023)

4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
by: Wang, Chaoyang, et al.
Published: (2025)

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
by: Bahmani, Sherwin, et al.
Published: (2025)

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
by: Zhang, Mengchen, et al.
Published: (2025)

Helix4D: Complex 4D Mesh Generation
by: Yenphraphai, Jiraphon, et al.
Published: (2026)

AlphaFlow: Understanding and Improving MeanFlow Models
by: Zhang, Huijie, et al.
Published: (2025)

3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models
by: Zhang, Yuhan, et al.
Published: (2025)

LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
by: Yang, Shuai, et al.
Published: (2024)

Personalized Text-to-Image Generation with Auto-Regressive Models
by: Sun, Kaiyue, et al.
Published: (2025)

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
by: Fang, Ye, et al.
Published: (2024)

Improving Progressive Generation with Decomposable Flow Matching
by: Haji-Ali, Moayed, et al.
Published: (2025)

VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing
by: Gu, Jing, et al.
Published: (2024)

Dynamic Concepts Personalization from Single Videos
by: Abdal, Rameen, et al.
Published: (2025)

AToM: Amortized Text-to-Mesh using 2D Diffusion
by: Qian, Guocheng, et al.
Published: (2024)

Interspatial Attention for Efficient 4D Human Video Generation
by: Shao, Ruizhi, et al.
Published: (2025)

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
by: Wu, Ziyi, et al.
Published: (2025)

Improving the Diffusability of Autoencoders
by: Skorokhodov, Ivan, et al.
Published: (2025)

Mind the Time: Temporally-Controlled Multi-Event Video Generation
by: Wu, Ziyi, et al.
Published: (2024)

One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
by: Haji-Ali, Moayed, et al.
Published: (2026)

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
by: Menapace, Willi, et al.
Published: (2024)

Adaptive 1D Video Diffusion Autoencoder
by: Teng, Yao, et al.
Published: (2026)

T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
by: Sun, Kaiyue, et al.
Published: (2025)

Garment Particles: A 2D-3D Symmetric Garment Representation for Generation and Editing
by: Nakayama, Kiyohiro, et al.
Published: (2026)

TextCraftor: Your Text Encoder Can be Image Quality Controller
by: Li, Yanyu, et al.
Published: (2024)

4Diffusion: Multi-view Video Diffusion Model for 4D Generation
by: Zhang, Haiyu, et al.
Published: (2024)

VIMI: Grounding Video Generation through Multi-modal Instruction
by: Fang, Yuwei, et al.
Published: (2024)

AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
by: Haji-Ali, Moayed, et al.
Published: (2024)

H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
by: Wu, Yushu, et al.
Published: (2025)

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
by: Taubner, Felix, et al.
Published: (2024)