| Main Authors: | Liu, Yang; Zhang, Zhiyong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.21712 |
Similar Items
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
by: Yang, Haibo, et al.
Published: (2024)
Deep learning for 3D human pose estimation and mesh recovery: A survey
by: Liu, Yang, et al.
Published: (2024)
AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation
by: Chen, Liyang, et al.
Published: (2023)
DiffMesh: A Motion-aware Diffusion Framework for Human Mesh Recovery from Videos
by: Zheng, Ce, et al.
Published: (2023)
Human Motion Video Generation: A Survey
by: Xue, Haiwei, et al.
Published: (2025)
Unified Generative and Discriminative Training for Multi-modal Large Language Models
by: Chow, Wei, et al.
Published: (2024)
3D2M Dataset: A 3-Dimension diverse Mesh Dataset
by: Dasgupta, Sankarshan
Published: (2024)
Contrastive Multi-Modal Hypergraph Reasoning for 3D Crowd Mesh Recovery
by: Sun, Minghao, et al.
Published: (2026)
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
by: Chen, Liyang, et al.
Published: (2025)
InstructHumans: Editing Animated 3D Human Textures with Instructions
by: Zhu, Jiayin, et al.
Published: (2024)
TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing
by: Lionar, Stefan, et al.
Published: (2025)
Robust Mesh Saliency Ground Truth Acquisition in VR via View Cone Sampling and Manifold Diffusion
by: Zheng, Guoquan, et al.
Published: (2026)
MOC-3D: Manifold-Order Consistency for Text-to-3D Generation
by: Fan, Chenyang, et al.
Published: (2026)
Perceptual Crack Detection for Rendered 3D Textured Meshes
by: Sarvestani, Armin Shafiee, et al.
Published: (2024)
VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation
by: Chen, Yang, et al.
Published: (2024)
Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval
by: Ding, Yiming, et al.
Published: (2026)
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models
by: Yang, Haibo, et al.
Published: (2024)
Programmable-Room: Interactive Textured 3D Room Meshes Generation Empowered by Large Language Models
by: Kim, Jihyun, et al.
Published: (2025)
Radio Frequency Signal based Human Silhouette Segmentation: A Sequential Diffusion Approach
by: Wen, Penghui, et al.
Published: (2024)
Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition
by: Zhang, Junzheng, et al.
Published: (2024)
StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing
by: Chen, Liyang, et al.
Published: (2025)
Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
by: Yin, Kangsheng, et al.
Published: (2025)
MS2Mesh-XR: Multi-modal Sketch-to-Mesh Generation in XR Environments
by: Tong, Yuqi, et al.
Published: (2024)
StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
by: Huang, Yiheng, et al.
Published: (2024)
GeoLink: A 3D-Aware Framework Towards Better Generalization in Cross-View Geo-Localization
by: Zhang, Hongyang, et al.
Published: (2026)
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
by: Li, Yinqi, et al.
Published: (2025)
On the Robustness of Human-Object Interaction Detection against Distribution Shift
by: Xie, Chi, et al.
Published: (2025)
HDiffTG: A Lightweight Hybrid Diffusion-Transformer-GCN Architecture for 3D Human Pose Estimation
by: Fu, Yajie, et al.
Published: (2025)
BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion
by: Jia, Tianzhi, et al.
Published: (2026)
SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
by: Zhu, Rui, et al.
Published: (2024)
Discriminative Probing and Tuning for Text-to-Image Generation
by: Qu, Leigang, et al.
Published: (2024)
HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment
by: Sarvestani, Armin Shafiee, et al.
Published: (2024)
REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
by: Wu, Di, et al.
Published: (2025)
SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting
by: Li, Wenrui, et al.
Published: (2024)
3D Gaussian Editing with A Single Image
by: Luo, Guan, et al.
Published: (2024)
MSVBench: Towards Human-Level Evaluation of Multi-Shot Video Generation
by: Shi, Haoyuan, et al.
Published: (2026)
Towards Robust and Realible Multimodal Misinformation Recognition with Incomplete Modality
by: Zhou, Hengyang, et al.
Published: (2025)
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
by: Lin, Jiantao, et al.
Published: (2025)
MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding
by: Liu, Chang, et al.
Published: (2025)
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding
by: Li, Zeju, et al.
Published: (2024)