:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Xuewei; Chen, Zhimin; Song, Yiren
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.17934

Similar Items

DiffSim: Taming Diffusion Models for Evaluating Visual Similarity
by: Song, Yiren, et al.
Published: (2024)

Alfie: Democratising RGBA Image Generation With No $$$
by: Quattrini, Fabio, et al.
Published: (2024)

LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
by: Song, Yiren, et al.
Published: (2025)

UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation
by: Wang, Xiang, et al.
Published: (2024)

InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
by: Lin, Yukang, et al.
Published: (2025)

OmniPSD: Layered PSD Generation with Diffusion Transformer
by: Liu, Cheng, et al.
Published: (2025)

Mitty: Diffusion-based Human-to-Robot Video Generation
by: Song, Yiren, et al.
Published: (2025)

EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition
by: Hu, Yihan, et al.
Published: (2025)

Generating Compositional Scenes via Text-to-image RGBA Instance Generation
by: Fontanella, Alessandro, et al.
Published: (2024)

Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)

Loom: Diffusion-Transformer for Interleaved Generation
by: Ye, Mingcheng, et al.
Published: (2025)

SketchAnimator: Animate Sketch via Motion Customization of Text-to-Video Diffusion Models
by: Yang, Ruolin, et al.
Published: (2025)

VISTA: Triplet-Supervised Video Style Transfer with Diffusion Transformers
by: Song, Yiren, et al.
Published: (2026)

TACO: Taming Diffusion for in-the-wild Video Amodal Completion
by: Lu, Ruijie, et al.
Published: (2025)

X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale
by: Yang, Pei, et al.
Published: (2025)

LayerAnimate: Layer-level Control for Animation
by: Yang, Yuxue, et al.
Published: (2025)

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation
by: Song, Yiren, et al.
Published: (2025)

Taming Consistency Distillation for Accelerated Human Image Animation
by: Wang, Xiang, et al.
Published: (2025)

AnimateAnything: Consistent and Controllable Animation for Video Generation
by: Lei, Guojun, et al.
Published: (2024)

REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion
by: Li, Xuewei, et al.
Published: (2026)

Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models
by: Millán, Marc Benedí San, et al.
Published: (2025)

Animated Stickers: Bringing Stickers to Life with Video Diffusion
by: Yan, David, et al.
Published: (2024)

WorldWander: Bridging Egocentric and Exocentric Worlds in Video Generation
by: Song, Quanjian, et al.
Published: (2025)

LEO: Generative Latent Image Animator for Human Video Synthesis
by: Wang, Yaohui, et al.
Published: (2023)

Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
by: Chen, Ting-Hsuan, et al.
Published: (2026)

AnimationBench: Are Video Models Good at Character-Centric Animation?
by: Wu, Leyi, et al.
Published: (2026)

StreamingEffect: Real-Time Human-Centric Video Effect Generation
by: Song, Yiren, et al.
Published: (2026)

AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment
by: Xu, Yuanfeng, et al.
Published: (2024)

Human Gaussian Splatting: Real-time Rendering of Animatable Avatars
by: Moreau, Arthur, et al.
Published: (2023)

UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation
by: Sun, Yang-Tian, et al.
Published: (2025)

InstaVSR: Taming Diffusion for Efficient and Temporally Consistent Video Super-Resolution
by: Hu, Jintong, et al.
Published: (2026)

Real Face Video Animation Platform
by: Chen, Xiaokai, et al.
Published: (2024)

AlphaVAE: Unified End-to-End RGBA Image Reconstruction and Generation with Alpha-Aware Representation Learning
by: Wang, Zile, et al.
Published: (2025)

CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
by: Wu, Xiaoxue, et al.
Published: (2025)

EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation
by: Qu, Qiang, et al.
Published: (2025)

SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers
by: Qiu, Di, et al.
Published: (2025)

BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors
by: Zhang, Tingyang, et al.
Published: (2024)

HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation
by: Zhou, Haiyang, et al.
Published: (2025)

An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video
by: Song, Xingyu, et al.
Published: (2024)

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
by: Jiang, Yanqin, et al.
Published: (2024)