| Main Authors: | Wu, Qingxuan, Dou, Zhiyang, Guo, Chuan, Huang, Yiming, Feng, Qiao, Zhou, Bing, Wang, Jian, Liu, Lingjie |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.06504 |
Similar Items
ModSkill: Physical Character Skill Modularization
by: Huang, Yiming, et al.
Published: (2025)
SnapMoGen: Human Motion Generation from Expressive Texts
by: Guo, Chuan, et al.
Published: (2025)
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
by: Wang, Chen, et al.
Published: (2025)
Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation
by: Chen, Chuhao, et al.
Published: (2025)
Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation
by: Liu, Shaowei, et al.
Published: (2025)
DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
by: Wu, Qingxuan, et al.
Published: (2024)
A Survey on Human Interaction Motion Generation
by: Sui, Kewei, et al.
Published: (2025)
SceneMI: Motion In-betweening for Modeling Human-Scene Interactions
by: Hwang, Inwoo, et al.
Published: (2025)
VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation
by: Chen, Zixuan, et al.
Published: (2024)
PhysHMR: Learning Humanoid Control Policies from Vision for Physically Plausible Human Motion Reconstruction
by: Feng, Qiao, et al.
Published: (2025)
TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis
by: Xie, Yu, et al.
Published: (2025)
GaGA: Towards Interactive Global Geolocation Assistant
by: Dou, Zhiyang, et al.
Published: (2024)
Dynamic Realms: 4D Content Analysis, Recovery and Generation with Geometric, Topological and Physical Priors
by: Dou, Zhiyang
Published: (2024)
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
by: Pang, Lianyu, et al.
Published: (2024)
Next-Scale Autoregressive Models for Text-to-Motion Generation
by: Zheng, Zhiwei, et al.
Published: (2026)
HandX: Scaling Bimanual Motion and Interaction Generation
by: Zhang, Zimu, et al.
Published: (2026)
FIA-Edit: Frequency-Interactive Attention for Efficient and High-Fidelity Inversion-Free Text-Guided Image Editing
by: Yang, Kaixiang, et al.
Published: (2025)
Disentangled Clothed Avatar Generation from Text Descriptions
by: Wang, Jionghao, et al.
Published: (2023)
DreamText: High Fidelity Scene Text Synthesis
by: Wang, Yibin, et al.
Published: (2024)
Yume-1.5: A Text-Controlled Interactive World Generation Model
by: Mao, Xiaofeng, et al.
Published: (2025)
High Fidelity Text to Image Generation with Contrastive Alignment and Structural Guidance
by: Gao, Danyi
Published: (2025)
Text-Conditioned Diffusion Model for High-Fidelity Korean Font Generation
by: Sami, Abdul, et al.
Published: (2025)
Unleashing Guidance Without Classifiers for Human-Object Interaction Animation
by: Wang, Ziyin, et al.
Published: (2026)
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)
Semi-supervised Text-based Person Search
by: Gao, Daming, et al.
Published: (2024)
Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes
by: Dou, Yiming, et al.
Published: (2025)
HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models
by: Peng, Xiaogang, et al.
Published: (2023)
TextBoost: Boosting Text Encoder for Personalized Text-to-Image Generation
by: Park, NaHyeon, et al.
Published: (2024)
TLControl: Trajectory and Language Control for Human Motion Synthesis
by: Wan, Weilin, et al.
Published: (2023)
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
by: Dai, Sisi, et al.
Published: (2024)
Interactive Visual Assessment for Text-to-Image Generation Models
by: Mi, Xiaoyue, et al.
Published: (2024)
Text-driven Multiplanar Visual Interaction for Semi-supervised Medical Image Segmentation
by: Huang, Kaiwen, et al.
Published: (2025)
DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling
by: Ghosh, Anindita, et al.
Published: (2025)
Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing
by: Xu, Yangyang, et al.
Published: (2024)
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization
by: Wu, Feize, et al.
Published: (2024)
Counting Guidance for High Fidelity Text-to-Image Synthesis
by: Kang, Wonjun, et al.
Published: (2023)
EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
by: Zhou, Wenyang, et al.
Published: (2023)
SHYI: Action Support for Contrastive Learning in High-Fidelity Text-to-Image Generation
by: Xia, Tianxiang, et al.
Published: (2025)
THOR: Text to Human-Object Interaction Diffusion via Relation Intervention
by: Wu, Qianyang, et al.
Published: (2024)
Text-guided Feature Disentanglement for Cross-modal Gait Recognition
by: Lu, Zhiyang, et al.
Published: (2026)