:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yang; Zhang, Zhiyong
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Multimedia
Online Access:	https://arxiv.org/abs/2604.21712

Similar Items

DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
by: Yang, Haibo, et al.
Published: (2024)

Deep learning for 3D human pose estimation and mesh recovery: A survey
by: Liu, Yang, et al.
Published: (2024)

AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation
by: Chen, Liyang, et al.
Published: (2023)

DiffMesh: A Motion-aware Diffusion Framework for Human Mesh Recovery from Videos
by: Zheng, Ce, et al.
Published: (2023)

Human Motion Video Generation: A Survey
by: Xue, Haiwei, et al.
Published: (2025)

Unified Generative and Discriminative Training for Multi-modal Large Language Models
by: Chow, Wei, et al.
Published: (2024)

3D2M Dataset: A 3-Dimension diverse Mesh Dataset
by: Dasgupta, Sankarshan
Published: (2024)

Contrastive Multi-Modal Hypergraph Reasoning for 3D Crowd Mesh Recovery
by: Sun, Minghao, et al.
Published: (2026)

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
by: Chen, Liyang, et al.
Published: (2025)

InstructHumans: Editing Animated 3D Human Textures with Instructions
by: Zhu, Jiayin, et al.
Published: (2024)

TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing
by: Lionar, Stefan, et al.
Published: (2025)

Robust Mesh Saliency Ground Truth Acquisition in VR via View Cone Sampling and Manifold Diffusion
by: Zheng, Guoquan, et al.
Published: (2026)

MOC-3D: Manifold-Order Consistency for Text-to-3D Generation
by: Fan, Chenyang, et al.
Published: (2026)

Perceptual Crack Detection for Rendered 3D Textured Meshes
by: Sarvestani, Armin Shafiee, et al.
Published: (2024)

VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation
by: Chen, Yang, et al.
Published: (2024)

Retrieving Any Relevant Moments: Benchmark and Models for Generalized Moment Retrieval
by: Ding, Yiming, et al.
Published: (2026)

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models
by: Yang, Haibo, et al.
Published: (2024)

Programmable-Room: Interactive Textured 3D Room Meshes Generation Empowered by Large Language Models
by: Kim, Jihyun, et al.
Published: (2025)

Radio Frequency Signal based Human Silhouette Segmentation: A Sequential Diffusion Approach
by: Wen, Penghui, et al.
Published: (2024)

Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition
by: Zhang, Junzheng, et al.
Published: (2024)

StableDub: Taming Diffusion Prior for Generalized and Efficient Visual Dubbing
by: Chen, Liyang, et al.
Published: (2025)

Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
by: Yin, Kangsheng, et al.
Published: (2025)

MS2Mesh-XR: Multi-modal Sketch-to-Mesh Generation in XR Environments
by: Tong, Yuqi, et al.
Published: (2024)

StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
by: Huang, Yiheng, et al.
Published: (2024)

GeoLink: A 3D-Aware Framework Towards Better Generalization in Cross-View Geo-Localization
by: Zhang, Hongyang, et al.
Published: (2026)

DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
by: Li, Yinqi, et al.
Published: (2025)

On the Robustness of Human-Object Interaction Detection against Distribution Shift
by: Xie, Chi, et al.
Published: (2025)

HDiffTG: A Lightweight Hybrid Diffusion-Transformer-GCN Architecture for 3D Human Pose Estimation
by: Fu, Yajie, et al.
Published: (2025)

BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion
by: Jia, Tianzhi, et al.
Published: (2026)

SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
by: Zhu, Rui, et al.
Published: (2024)

Discriminative Probing and Tuning for Text-to-Image Generation
by: Qu, Leigang, et al.
Published: (2024)

HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment
by: Sarvestani, Armin Shafiee, et al.
Published: (2024)

REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints
by: Wu, Di, et al.
Published: (2025)

SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting
by: Li, Wenrui, et al.
Published: (2024)

3D Gaussian Editing with A Single Image
by: Luo, Guan, et al.
Published: (2024)

MSVBench: Towards Human-Level Evaluation of Multi-Shot Video Generation
by: Shi, Haoyuan, et al.
Published: (2026)

Towards Robust and Realible Multimodal Misinformation Recognition with Incomplete Modality
by: Zhou, Hengyang, et al.
Published: (2025)

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
by: Lin, Jiantao, et al.
Published: (2025)

MEDTalk: Multimodal Controlled 3D Facial Animation with Dynamic Emotions by Disentangled Embedding
by: Liu, Chang, et al.
Published: (2025)

3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding
by: Li, Zeju, et al.
Published: (2024)