:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Qingxuan; Dou, Zhiyang; Guo, Chuan; Huang, Yiming; Feng, Qiao; Zhou, Bing; Wang, Jian; Liu, Lingjie
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.06504

Similar Items

ModSkill: Physical Character Skill Modularization
by: Huang, Yiming, et al.
Published: (2025)

SnapMoGen: Human Motion Generation from Expressive Texts
by: Guo, Chuan, et al.
Published: (2025)

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
by: Wang, Chen, et al.
Published: (2025)

Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation
by: Chen, Chuhao, et al.
Published: (2025)

Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation
by: Liu, Shaowei, et al.
Published: (2025)

DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
by: Wu, Qingxuan, et al.
Published: (2024)

A Survey on Human Interaction Motion Generation
by: Sui, Kewei, et al.
Published: (2025)

SceneMI: Motion In-betweening for Modeling Human-Scene Interactions
by: Hwang, Inwoo, et al.
Published: (2025)

VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation
by: Chen, Zixuan, et al.
Published: (2024)

PhysHMR: Learning Humanoid Control Policies from Vision for Physically Plausible Human Motion Reconstruction
by: Feng, Qiao, et al.
Published: (2025)

TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis
by: Xie, Yu, et al.
Published: (2025)

GaGA: Towards Interactive Global Geolocation Assistant
by: Dou, Zhiyang, et al.
Published: (2024)

Dynamic Realms: 4D Content Analysis, Recovery and Generation with Geometric, Topological and Physical Priors
by: Dou, Zhiyang
Published: (2024)

AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
by: Pang, Lianyu, et al.
Published: (2024)

Next-Scale Autoregressive Models for Text-to-Motion Generation
by: Zheng, Zhiwei, et al.
Published: (2026)

HandX: Scaling Bimanual Motion and Interaction Generation
by: Zhang, Zimu, et al.
Published: (2026)

FIA-Edit: Frequency-Interactive Attention for Efficient and High-Fidelity Inversion-Free Text-Guided Image Editing
by: Yang, Kaixiang, et al.
Published: (2025)

Disentangled Clothed Avatar Generation from Text Descriptions
by: Wang, Jionghao, et al.
Published: (2023)

DreamText: High Fidelity Scene Text Synthesis
by: Wang, Yibin, et al.
Published: (2024)

Yume-1.5: A Text-Controlled Interactive World Generation Model
by: Mao, Xiaofeng, et al.
Published: (2025)

High Fidelity Text to Image Generation with Contrastive Alignment and Structural Guidance
by: Gao, Danyi
Published: (2025)

Text-Conditioned Diffusion Model for High-Fidelity Korean Font Generation
by: Sami, Abdul, et al.
Published: (2025)

Unleashing Guidance Without Classifiers for Human-Object Interaction Animation
by: Wang, Ziyin, et al.
Published: (2026)

DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)

Semi-supervised Text-based Person Search
by: Gao, Daming, et al.
Published: (2024)

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes
by: Dou, Yiming, et al.
Published: (2025)

HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models
by: Peng, Xiaogang, et al.
Published: (2023)

TextBoost: Boosting Text Encoder for Personalized Text-to-Image Generation
by: Park, NaHyeon, et al.
Published: (2024)

TLControl: Trajectory and Language Control for Human Motion Synthesis
by: Wan, Weilin, et al.
Published: (2023)

InterFusion: Text-Driven Generation of 3D Human-Object Interaction
by: Dai, Sisi, et al.
Published: (2024)

Interactive Visual Assessment for Text-to-Image Generation Models
by: Mi, Xiaoyue, et al.
Published: (2024)

Text-driven Multiplanar Visual Interaction for Semi-supervised Medical Image Segmentation
by: Huang, Kaiwen, et al.
Published: (2025)

DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling
by: Ghosh, Anindita, et al.
Published: (2025)

Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing
by: Xu, Yangyang, et al.
Published: (2024)

CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization
by: Wu, Feize, et al.
Published: (2024)

Counting Guidance for High Fidelity Text-to-Image Synthesis
by: Kang, Wonjun, et al.
Published: (2023)

EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
by: Zhou, Wenyang, et al.
Published: (2023)

SHYI: Action Support for Contrastive Learning in High-Fidelity Text-to-Image Generation
by: Xia, Tianxiang, et al.
Published: (2025)

THOR: Text to Human-Object Interaction Diffusion via Relation Intervention
by: Wu, Qianyang, et al.
Published: (2024)

Text-guided Feature Disentanglement for Cross-modal Gait Recognition
by: Lu, Zhiyang, et al.
Published: (2026)