| Main Authors: | Wu, Xian, Liu, Chang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.15661 |
Similar Items
EraserDiT: Fast Video Inpainting with Diffusion Transformer Model
by: Liu, Jie, et al.
Published: (2025)
iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer
by: Shen, Zhelun, et al.
Published: (2025)
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
by: Ding, Hangliang, et al.
Published: (2025)
FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers
by: He, Xuanhua, et al.
Published: (2025)
SparseDiT: Token Sparsification for Efficient Diffusion Transformer
by: Chang, Shuning, et al.
Published: (2024)
FrameDiT: Diffusion Transformer with Matrix Attention for Efficient Video Generation
by: Le, Minh Khoa, et al.
Published: (2026)
DiTVR: Zero-Shot Diffusion Transformer for Video Restoration
by: Gao, Sicheng, et al.
Published: (2025)
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
by: Zhao, Tianchen, et al.
Published: (2024)
Flow-Guided Diffusion for Video Inpainting
by: Gu, Bohai, et al.
Published: (2023)
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation
by: Wang, Kai, et al.
Published: (2024)
Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion
by: Gu, Bohai, et al.
Published: (2024)
AVID: Any-Length Video Inpainting with Diffusion Model
by: Zhang, Zhixing, et al.
Published: (2023)
S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation
by: Zhao, Lin, et al.
Published: (2026)
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
by: Jiang, Junpeng, et al.
Published: (2025)
Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting
by: Zhang, Kaidong, et al.
Published: (2023)
DiffuEraser: A Diffusion Model for Video Inpainting
by: Li, Xiaowen, et al.
Published: (2025)
Learnable Gated Temporal Shift Module for Deep Video Inpainting
by: Chang, Ya-Liang, et al.
Published: (2019)
LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation
by: Yang, Lianwei, et al.
Published: (2025)
Human4DiT: 360-degree Human Video Generation with 4D Diffusion Transformer
by: Shao, Ruizhi, et al.
Published: (2024)
DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
by: Lyu, Hengye, et al.
Published: (2026)
HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization
by: Liu, Wenxuan, et al.
Published: (2024)
MTV-Inpaint: Multi-Task Long Video Inpainting
by: Yang, Shiyuan, et al.
Published: (2025)
LuxDiT: Lighting Estimation with Video Diffusion Transformer
by: Liang, Ruofan, et al.
Published: (2025)
Semantically Consistent Video Inpainting with Conditional Diffusion Models
by: Green, Dylan, et al.
Published: (2024)
Geometric Image Editing via Effects-Sensitive In-Context Inpainting with Diffusion Transformers
by: Zhang, Shuo, et al.
Published: (2026)
PixelDiT: Pixel Diffusion Transformers for Image Generation
by: Yu, Yongsheng, et al.
Published: (2025)
VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers
by: Zheng, Jun, et al.
Published: (2024)
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
by: Ju, Xuan, et al.
Published: (2024)
PTQ4DiT: Post-training Quantization for Diffusion Transformers
by: Wu, Junyi, et al.
Published: (2024)
LaVin-DiT: Large Vision Diffusion Transformer
by: Wang, Zhaoqing, et al.
Published: (2024)
DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation
by: Zhao, Wangbo, et al.
Published: (2025)
Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer
by: Yang, Zhuoyi, et al.
Published: (2024)
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
by: Huang, Jun, et al.
Published: (2025)
NeRF Inpainting with Geometric Diffusion Prior and Balanced Score Distillation
by: Zhang, Menglin, et al.
Published: (2024)
Transformer-based Image and Video Inpainting: Current Challenges and Future Directions
by: Elharrouss, Omar, et al.
Published: (2024)
Towards Online Real-Time Memory-based Video Inpainting Transformers
by: Thiry, Guillaume, et al.
Published: (2024)
Mumpy: Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection
by: Zhang, Ying, et al.
Published: (2024)
Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
by: Qi, Tianhao, et al.
Published: (2025)
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
by: Sun, Yasheng, et al.
Published: (2025)
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
by: Teng, Yao, et al.
Published: (2024)