| Main Authors: | Chen, Yunuo, Cao, Junli, Goel, Vidit, Korolev, Sergei, Jiang, Chenfanfu, Ren, Jian, Tulyakov, Sergey, Kag, Anil |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.03639 |
Similar Items
Lightweight Predictive 3D Gaussian Splats
by: Cao, Junli, et al.
Published: (2024)
Wonderland: Navigating 3D Scenes from a Single Image
by: Liang, Hanwen, et al.
Published: (2024)
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
by: Kag, Anil, et al.
Published: (2024)
EFlow: Fast Few-Step Video Generator Training from Scratch via Efficient Solution Flow
by: Park, Dogyun, et al.
Published: (2026)
Scalable Ranked Preference Optimization for Text-to-Image Generation
by: Karthik, Shyamgopal, et al.
Published: (2024)
SF-V: Single Forward Video Generation Model
by: Zhang, Zhixing, et al.
Published: (2024)
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
by: Sui, Yang, et al.
Published: (2024)
4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
by: Wang, Chaoyang, et al.
Published: (2025)
Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning
by: Wang, Yifan, et al.
Published: (2026)
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
by: Menapace, Willi, et al.
Published: (2024)
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
by: Wu, Ziyi, et al.
Published: (2025)
TextCraftor: Your Text Encoder Can be Image Quality Controller
by: Li, Yanyu, et al.
Published: (2024)
H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
by: Wu, Yushu, et al.
Published: (2025)
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models
by: Yu, Heng, et al.
Published: (2024)
AnimaMimic: Imitating 3D Animation from Video Priors
by: Xie, Tianyi, et al.
Published: (2025)
Substepping the Material Point Method
by: Jiang, Chenfanfu
Published: (2025)
SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
by: Wu, Yushu, et al.
Published: (2024)
Taming Diffusion Transformer for Efficient Mobile Video Generation in Seconds
by: Wu, Yushu, et al.
Published: (2025)
Dress-1-to-3: Single Image to Simulation-Ready 3D Outfit with Diffusion Prior and Differentiable Physics
by: Li, Xuan, et al.
Published: (2025)
Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication
by: Chen, Yunuo, et al.
Published: (2024)
Diffusion Priors for Dynamic View Synthesis from Monocular Videos
by: Wang, Chaoyang, et al.
Published: (2024)
Sprint: Sparse-Dense Residual Fusion for Efficient Diffusion Transformers
by: Park, Dogyun, et al.
Published: (2025)
S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation
by: Zhao, Lin, et al.
Published: (2026)
Learn2Fold: Structured Origami Generation with World Model Planning
by: Huang, Yanjia, et al.
Published: (2026)
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
by: Haji-Ali, Moayed, et al.
Published: (2026)
EMPM: Embodied MPM for Modeling and Simulation of Deformable Objects
by: Chen, Yunuo, et al.
Published: (2026)
Hierarchical Patch Diffusion Models for High-Resolution Video Generation
by: Skorokhodov, Ivan, et al.
Published: (2024)
Birth of a Painting: Differentiable Brushstroke Reconstruction
by: Jiang, Ying, et al.
Published: (2025)
ShapeGen4D: Towards High Quality 4D Shape Generation from Videos
by: Yenphraphai, Jiraphon, et al.
Published: (2025)
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
by: Hu, Dongting, et al.
Published: (2024)
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
by: Pinyoanuntapong, Ekkasit, et al.
Published: (2024)
Unstructured Moving Least Squares Material Point Methods: A Stable Kernel Approach With Continuous Gradient Reconstruction on General Unstructured Tessellations
by: Cao, Yadi, et al.
Published: (2023)
PhysAnimator: Physics-Guided Generative Cartoon Animation
by: Xie, Tianyi, et al.
Published: (2025)
A Convex Formulation of Frictional Contact for the Material Point Method and Rigid Bodies
by: Zong, Zeshun, et al.
Published: (2024)
Producing Histopathology Phantom Images using Generative Adversarial Networks to improve Tumor Detection
by: Gautam, Vidit
Published: (2022)
Can Text-to-Video Generation help Video-Language Alignment?
by: Zanella, Luca, et al.
Published: (2025)
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
by: Zhuang, Peiye, et al.
Published: (2024)
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
by: Rahman, Tanzila, et al.
Published: (2024)
PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
by: Xie, Tianyi, et al.
Published: (2023)
VASE: Object-Centric Appearance and Shape Manipulation of Real Videos
by: Peruzzo, Elia, et al.
Published: (2024)