| Main Authors: | Bahmani, Sherwin, Liu, Xian, Yifan, Wang, Skorokhodov, Ivan, Rong, Victor, Liu, Ziwei, Liu, Xihui, Park, Jeong Joon, Tulyakov, Sergey, Wetzstein, Gordon, Tagliasacchi, Andrea, Lindell, David B. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.17920 |
Similar Items
4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling
by: Bahmani, Sherwin, et al.
Published: (2023)
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
by: Bahmani, Sherwin, et al.
Published: (2024)
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
by: Bahmani, Sherwin, et al.
Published: (2024)
Grow with the Flow: 4D Reconstruction of Growing Plants with Gaussian Flow Fields
by: Luo, Weihan, et al.
Published: (2026)
HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
by: Liu, Xian, et al.
Published: (2023)
MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars
by: Taubner, Felix, et al.
Published: (2025)
GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling
by: Rong, Victor, et al.
Published: (2024)
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
by: Skorokhodov, Ivan, et al.
Published: (2024)
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
by: Namekata, Koichi, et al.
Published: (2024)
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
by: Wu, Tong, et al.
Published: (2024)
4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion
by: Wang, Chaoyang, et al.
Published: (2024)
HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
by: Liu, Xian, et al.
Published: (2023)
4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
by: Wang, Chaoyang, et al.
Published: (2025)
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
by: Bahmani, Sherwin, et al.
Published: (2025)
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
by: Zhang, Mengchen, et al.
Published: (2025)
Helix4D: Complex 4D Mesh Generation
by: Yenphraphai, Jiraphon, et al.
Published: (2026)
AlphaFlow: Understanding and Improving MeanFlow Models
by: Zhang, Huijie, et al.
Published: (2025)
3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models
by: Zhang, Yuhan, et al.
Published: (2025)
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
by: Yang, Shuai, et al.
Published: (2024)
Personalized Text-to-Image Generation with Auto-Regressive Models
by: Sun, Kaiyue, et al.
Published: (2025)
Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
by: Fang, Ye, et al.
Published: (2024)
Improving Progressive Generation with Decomposable Flow Matching
by: Haji-Ali, Moayed, et al.
Published: (2025)
VIA: Unified Spatiotemporal Video Adaptation Framework for Global and Local Video Editing
by: Gu, Jing, et al.
Published: (2024)
Dynamic Concepts Personalization from Single Videos
by: Abdal, Rameen, et al.
Published: (2025)
AToM: Amortized Text-to-Mesh using 2D Diffusion
by: Qian, Guocheng, et al.
Published: (2024)
Interspatial Attention for Efficient 4D Human Video Generation
by: Shao, Ruizhi, et al.
Published: (2025)
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
by: Wu, Ziyi, et al.
Published: (2025)
Improving the Diffusability of Autoencoders
by: Skorokhodov, Ivan, et al.
Published: (2025)
Mind the Time: Temporally-Controlled Multi-Event Video Generation
by: Wu, Ziyi, et al.
Published: (2024)
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
by: Haji-Ali, Moayed, et al.
Published: (2026)
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
by: Menapace, Willi, et al.
Published: (2024)
Adaptive 1D Video Diffusion Autoencoder
by: Teng, Yao, et al.
Published: (2026)
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
by: Sun, Kaiyue, et al.
Published: (2025)
Garment Particles: A 2D-3D Symmetric Garment Representation for Generation and Editing
by: Nakayama, Kiyohiro, et al.
Published: (2026)
TextCraftor: Your Text Encoder Can be Image Quality Controller
by: Li, Yanyu, et al.
Published: (2024)
4Diffusion: Multi-view Video Diffusion Model for 4D Generation
by: Zhang, Haiyu, et al.
Published: (2024)
VIMI: Grounding Video Generation through Multi-modal Instruction
by: Fang, Yuwei, et al.
Published: (2024)
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
by: Haji-Ali, Moayed, et al.
Published: (2024)
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
by: Wu, Yushu, et al.
Published: (2025)
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
by: Taubner, Felix, et al.
Published: (2024)