| Main Authors: | Wu, Yinwei, Yang, Xingyi, Wang, Xinchao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.20249 |
Similar Items
Language Model as Visual Explainer
by: Yang, Xingyi, et al.
Published: (2024)
Hash3D: Training-free Acceleration for 3D Generation
by: Yang, Xingyi, et al.
Published: (2024)
Compositional Video Generation as Flow Equalization
by: Yang, Xingyi, et al.
Published: (2024)
Image Editing As Programs with Diffusion Models
by: Hu, Yujia, et al.
Published: (2025)
Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling
by: Kong, Hanyang, et al.
Published: (2025)
Q-ARVD: Quantizing Autoregressive Video Diffusion Models
by: Tang, Siao, et al.
Published: (2026)
Unsegment Anything by Simulating Deformation
by: Lu, Jiahao, et al.
Published: (2024)
Neural Metamorphosis
by: Yang, Xingyi, et al.
Published: (2024)
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion
by: Kong, Hanyang, et al.
Published: (2025)
Kolmogorov-Arnold Transformer
by: Yang, Xingyi, et al.
Published: (2024)
OminiControl2: Efficient Conditioning for Diffusion Transformers
by: Tan, Zhenxiong, et al.
Published: (2025)
Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation
by: Li, Bonan, et al.
Published: (2024)
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
by: Yuan, Yuheng, et al.
Published: (2025)
Flash Sculptor: Modular 3D Worlds from Objects
by: Hu, Yujia, et al.
Published: (2025)
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
by: Shen, Qiuhong, et al.
Published: (2024)
C4D: 4D Made from 3D through Dual Correspondences
by: Wang, Shizun, et al.
Published: (2025)
OminiControl: Minimal and Universal Control for Diffusion Transformer
by: Tan, Zhenxiong, et al.
Published: (2024)
Test3R: Learning to Reconstruct 3D at Test Time
by: Yuan, Yuheng, et al.
Published: (2025)
Video-Infinity: Distributed Long Video Generation
by: Tan, Zhenxiong, et al.
Published: (2024)
Few-shot Implicit Function Generation via Equivariance
by: Huang, Suizhi, et al.
Published: (2025)
IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation
by: Wu, Yinwei, et al.
Published: (2024)
Minute-Long Videos with Dual Parallelisms
by: Wang, Zeqing, et al.
Published: (2025)
GFlow: Recovering 4D World from Monocular Video
by: Wang, Shizun, et al.
Published: (2024)
Vision Bridge Transformer at Scale
by: Tan, Zhenxiong, et al.
Published: (2025)
StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
by: Liu, Songhua, et al.
Published: (2024)
ERTACache: Error Rectification and Timesteps Adjustment for Efficient Diffusion
by: Peng, Xurui, et al.
Published: (2025)
DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation
by: Shen, Sitian, et al.
Published: (2024)
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
by: Yu, Runpeng, et al.
Published: (2025)
CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation
by: Zhou, Letian, et al.
Published: (2025)
Vista3D: Unravel the 3D Darkside of a Single Image
by: Shen, Qiuhong, et al.
Published: (2024)
DREAM: Diffusion Rectification and Estimation-Adaptive Models
by: Zhou, Jinxin, et al.
Published: (2023)
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
by: Liu, Songhua, et al.
Published: (2024)
DeContext as Defense: Safe Image Editing in Diffusion Transformers
by: Shen, Linghui, et al.
Published: (2025)
DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators
by: Kong, Hanyang, et al.
Published: (2023)
PRISM: Prior Rectification and Uncertainty-Aware Structure Modeling for Diffusion-Based Text Image Super-Resolution
by: Xu, Zihang, et al.
Published: (2026)
Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising
by: Fang, Gongfan, et al.
Published: (2024)
TARO: Temporal Adversarial Rectification Optimization Using Diffusion Models as Purifiers
by: Wesego, Daniel, et al.
Published: (2026)
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
by: Chen, Zigeng, et al.
Published: (2024)
Neural Lineage
by: Yu, Runpeng, et al.
Published: (2024)
Sponge Tool Attack: Stealthy Denial-of-Efficiency against Tool-Augmented Agentic Reasoning
by: Li, Qi, et al.
Published: (2026)