Library Catalog

Bibliographic Details
Main Authors:	Liu, Chang; Chen, Mengting; Huang, Yixuan; Wu, Haoning; Ju, Chen; Xiao, Shuai; Lan, Jinsong; Wang, Yanfeng
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.10523

Similar Items

Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
by: Wang, Haicheng, et al.
Published: (2024)

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment
by: Chen, Mengting, et al.
Published: (2024)

Cell Variational Information Bottleneck Network
by: Zhai, Zhonghua, et al.
Published: (2024)

Animate-X: Universal Character Image Animation with Enhanced Motion Representation
by: Tan, Shuai, et al.
Published: (2024)

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
by: Liu, Chang, et al.
Published: (2023)

MatchTime: Towards Automatic Soccer Game Commentary Generation
by: Rao, Jiayuan, et al.
Published: (2024)

Wave-Particle (Continuous-Discrete) Dualistic Visual Tokenization for Unified Understanding and Generation
by: Chen, Yizhu, et al.
Published: (2025)

SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
by: Wu, Haoning, et al.
Published: (2025)

Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
by: Tan, Shuai, et al.
Published: (2025)

Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos
by: Xu, Zhengze, et al.
Published: (2024)

Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
by: Ju, Chen, et al.
Published: (2024)

Implicit Preference Alignment for Human Image Animation
by: Wang, Yuanzhi, et al.
Published: (2026)

FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization
by: Song, Quanjian, et al.
Published: (2026)

AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment
by: Xu, Yuanfeng, et al.
Published: (2024)

iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance
by: Zheng, Jun, et al.
Published: (2026)

AnimateAnywhere: Rouse the Background in Human Image Animation
by: Liu, Xiaoyu, et al.
Published: (2025)

Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
by: Wang, Zhicheng, et al.
Published: (2025)

Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment
by: Chen, Yang, et al.
Published: (2025)

Squeeze Out Tokens from Sample for Finer-Grained Data Governance
by: Lin, Weixiong, et al.
Published: (2025)

LSF-Animation: Label-Free Speech-Driven Facial Animation via Implicit Feature Representation
by: Lu, Xin, et al.
Published: (2025)

KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes
by: Wu, Jingchao, et al.
Published: (2025)

EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation
by: Qu, Qiang, et al.
Published: (2025)

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
by: Wu, Haoning, et al.
Published: (2024)

The Semantic Lifecycle in Embodied AI: Acquisition, Representation and Storage via Foundation Models
by: Chen, Shuai, et al.
Published: (2026)

Count Anything at Any Granularity
by: Liu, Chang, et al.
Published: (2026)

LINR Bridge: Vector Graphic Animation via Neural Implicits and Video Diffusion Priors
by: Gao, Wenshuo, et al.
Published: (2025)

X-Dyna: Expressive Dynamic Human Image Animation
by: Chang, Di, et al.
Published: (2025)

HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
by: Wang, Zhenzhi, et al.
Published: (2024)

Multi-identity Human Image Animation with Structural Video Diffusion
by: Wang, Zhenzhi, et al.
Published: (2025)

LEO: Generative Latent Image Animator for Human Video Synthesis
by: Wang, Yaohui, et al.
Published: (2023)

Rethinking the Evaluation of Visible and Infrared Image Fusion
by: Guan, Dayan, et al.
Published: (2024)

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
by: Shi, Shijun, et al.
Published: (2025)

EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration
by: Li, Wuyang, et al.
Published: (2026)

DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
by: Cheng, Haozhe, et al.
Published: (2024)

EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models
by: Zhang, Yixuan, et al.
Published: (2025)

Disco4D: Disentangled 4D Human Generation and Animation from a Single Image
by: Pang, Hui En, et al.
Published: (2024)

AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
by: Ma, Chaofan, et al.
Published: (2023)

Contrast-Unity for Partially-Supervised Temporal Sentence Grounding
by: Wang, Haicheng, et al.
Published: (2025)

High-Fidelity and Long-Duration Human Image Animation with Diffusion Transformer
by: Zheng, Shen, et al.
Published: (2025)

StableAnimator: High-Quality Identity-Preserving Human Image Animation
by: Tu, Shuyuan, et al.
Published: (2024)