| Main Authors: | Wang, Yutong, Zhang, Haiyu, Xue, Tianfan, Qiao, Yu, Wang, Yaohui, Xu, Chang, Chen, Xinyuan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.06802 |
Similar Items
PARE: Pruning and Adaptive Routing for Efficient Video Generation
by: Wang, Yutong, et al.
Published: (2026)
AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset
by: Zhang, Haiyu, et al.
Published: (2025)
4Diffusion: Multi-view Video Diffusion Model for 4D Generation
by: Zhang, Haiyu, et al.
Published: (2024)
GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects
by: Li, Shujia, et al.
Published: (2025)
ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions
by: Wu, Xiaoxue, et al.
Published: (2025)
CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
by: Wu, Xiaoxue, et al.
Published: (2025)
ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation
by: Peng, Bo, et al.
Published: (2023)
LEO: Generative Latent Image Animator for Human Video Synthesis
by: Wang, Yaohui, et al.
Published: (2023)
BEAT: Rhythm-Elastic Alignment for Agentic Music-guided Movie Trailer Generation
by: Wang, Yutong, et al.
Published: (2026)
Latte: Latent Diffusion Transformer for Video Generation
by: Ma, Xin, et al.
Published: (2024)
LIA-X: Interpretable Latent Portrait Animator
by: Wang, Yaohui, et al.
Published: (2025)
WSVD: Weighted Low-Rank Approximation for Fast and Efficient Execution of Low-Precision Vision-Language Models
by: Wang, Haiyu, et al.
Published: (2026)
The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation
by: Gao, Bingjie, et al.
Published: (2025)
Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
by: Wen, Hao, et al.
Published: (2024)
HDRFlow: Real-Time HDR Video Reconstruction with Large Motions
by: Xu, Gangwei, et al.
Published: (2024)
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
by: Ma, Xin, et al.
Published: (2024)
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling
by: Gao, Bingjie, et al.
Published: (2025)
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
by: Chen, Sixiang, et al.
Published: (2026)
Training-free Stylized Text-to-Image Generation with Fast Inference
by: Ma, Xin, et al.
Published: (2025)
LoRA-Edit: Controllable First-Frame-Guided Video Editing via Mask-Aware LoRA Fine-Tuning
by: Gao, Chenjian, et al.
Published: (2025)
Vlogger: Make Your Dream A Vlog
by: Zhuang, Shaobin, et al.
Published: (2024)
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
by: Huang, Ziqi, et al.
Published: (2024)
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
by: Wang, Yi, et al.
Published: (2023)
Memory-Efficient Transfer Learning with Fading Side Networks via Masked Dual Path Distillation
by: Zhang, Yutong, et al.
Published: (2026)
EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation
by: Zhang, Ziran, et al.
Published: (2025)
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision
by: Zhuang, Shaobin, et al.
Published: (2025)
AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection
by: Wang, Yujin, et al.
Published: (2024)
VideoDistill: Language-aware Vision Distillation for Video Question Answering
by: Zou, Bo, et al.
Published: (2024)
Uni-ISP: Toward Unifying the Learning of ISPs from Multiple Mobile Cameras
by: Li, Lingen, et al.
Published: (2024)
DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment
by: Cai, Xin, et al.
Published: (2026)
EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
by: Huang, Zehuan, et al.
Published: (2023)
From Gallery to Wrist: Realistic 3D Bracelet Insertion in Videos
by: Gao, Chenjian, et al.
Published: (2025)
Weakly Supervised Temporal Sentence Grounding via Positive Sample Mining
by: Dong, Lu, et al.
Published: (2025)
DiffIER: Optimizing Diffusion Models with Iterative Error Reduction
by: Chen, Ao, et al.
Published: (2025)
Follow-Your-Creation: Empowering 4D Creation through Video Inpainting
by: Ma, Yue, et al.
Published: (2025)
Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers
by: Ma, Xin, et al.
Published: (2025)
Cascading Refinement Video Denoising with Uncertainty Adaptivity
by: Yu, Xinyuan
Published: (2024)
Uni-Animator: Towards Unified Visual Colorization
by: Chen, Xinyuan, et al.
Published: (2026)
Bilateral Guided Radiance Field Processing
by: Wang, Yuehao, et al.
Published: (2024)
Harvest Video Foundation Models via Efficient Post-Pretraining
by: Li, Yizhuo, et al.
Published: (2023)