| Main Authors: | Liu, Chang, Chen, Mengting, Huang, Yixuan, Wu, Haoning, Ju, Chen, Xiao, Shuai, Lan, Jinsong, Wang, Yanfeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.10523 |
Similar Items
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
by: Wang, Haicheng, et al.
Published: (2024)
Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment
by: Chen, Mengting, et al.
Published: (2024)
Cell Variational Information Bottleneck Network
by: Zhai, Zhonghua, et al.
Published: (2024)
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
by: Tan, Shuai, et al.
Published: (2024)
Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models
by: Liu, Chang, et al.
Published: (2023)
MatchTime: Towards Automatic Soccer Game Commentary Generation
by: Rao, Jiayuan, et al.
Published: (2024)
Wave-Particle (Continuous-Discrete) Dualistic Visual Tokenization for Unified Understanding and Generation
by: Chen, Yizhu, et al.
Published: (2025)
SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
by: Wu, Haoning, et al.
Published: (2025)
Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
by: Tan, Shuai, et al.
Published: (2025)
Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos
by: Xu, Zhengze, et al.
Published: (2024)
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
by: Ju, Chen, et al.
Published: (2024)
Implicit Preference Alignment for Human Image Animation
by: Wang, Yuanzhi, et al.
Published: (2026)
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization
by: Song, Quanjian, et al.
Published: (2026)
AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment
by: Xu, Yuanfeng, et al.
Published: (2024)
iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance
by: Zheng, Jun, et al.
Published: (2026)
AnimateAnywhere: Rouse the Background in Human Image Animation
by: Liu, Xiaoyu, et al.
Published: (2025)
Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
by: Wang, Zhicheng, et al.
Published: (2025)
Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment
by: Chen, Yang, et al.
Published: (2025)
Squeeze Out Tokens from Sample for Finer-Grained Data Governance
by: Lin, Weixiong, et al.
Published: (2025)
LSF-Animation: Label-Free Speech-Driven Facial Animation via Implicit Feature Representation
by: Lu, Xin, et al.
Published: (2025)
KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes
by: Wu, Jingchao, et al.
Published: (2025)
EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation
by: Qu, Qiang, et al.
Published: (2025)
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
by: Wu, Haoning, et al.
Published: (2024)
The Semantic Lifecycle in Embodied AI: Acquisition, Representation and Storage via Foundation Models
by: Chen, Shuai, et al.
Published: (2026)
Count Anything at Any Granularity
by: Liu, Chang, et al.
Published: (2026)
LINR Bridge: Vector Graphic Animation via Neural Implicits and Video Diffusion Priors
by: Gao, Wenshuo, et al.
Published: (2025)
X-Dyna: Expressive Dynamic Human Image Animation
by: Chang, Di, et al.
Published: (2025)
HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
by: Wang, Zhenzhi, et al.
Published: (2024)
Multi-identity Human Image Animation with Structural Video Diffusion
by: Wang, Zhenzhi, et al.
Published: (2025)
LEO: Generative Latent Image Animator for Human Video Synthesis
by: Wang, Yaohui, et al.
Published: (2023)
Rethinking the Evaluation of Visible and Infrared Image Fusion
by: Guan, Dayan, et al.
Published: (2024)
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
by: Shi, Shijun, et al.
Published: (2025)
EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration
by: Li, Wuyang, et al.
Published: (2026)
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
by: Cheng, Haozhe, et al.
Published: (2024)
EmoDiffusion: Enhancing Emotional 3D Facial Animation with Latent Diffusion Models
by: Zhang, Yixuan, et al.
Published: (2025)
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image
by: Pang, Hui En, et al.
Published: (2024)
AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
by: Ma, Chaofan, et al.
Published: (2023)
Contrast-Unity for Partially-Supervised Temporal Sentence Grounding
by: Wang, Haicheng, et al.
Published: (2025)
High-Fidelity and Long-Duration Human Image Animation with Diffusion Transformer
by: Zheng, Shen, et al.
Published: (2025)
StableAnimator: High-Quality Identity-Preserving Human Image Animation
by: Tu, Shuyuan, et al.
Published: (2024)