:: Library Catalog

Bibliographic Details
Main Authors:	Yan, Sheng; Wang, Yong; Du, Xin; Yuan, Junsong; Liu, Mengyuan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.08337

Similar Items

MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions
by: Yan, Sheng, et al.
Published: (2024)

MoSa: Motion Generation with Scalable Autoregressive Modeling
by: Liu, Mengyuan, et al.
Published: (2025)

Cross-Modal Retrieval for Motion and Text via DropTriple Loss
by: Yan, Sheng, et al.
Published: (2023)

PoseMoE: Mixture-of-Experts Network for Monocular 3D Human Pose Estimation
by: Liu, Mengyuan, et al.
Published: (2025)

Lens Privacy Sealing: A New Benchmark and Method for Physical Privacy-Preserving Action Recognition
by: Liu, Mengyuan, et al.
Published: (2026)

LaxMotion: Rethinking Supervision Granularity for 3D Human Motion Generation
by: Liu, Sheng, et al.
Published: (2025)

PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation
by: Zhao, Sihan, et al.
Published: (2025)

Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation
by: Wang, Xinshun, et al.
Published: (2026)

HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation
by: Gan, Qijun, et al.
Published: (2025)

Recognizing Actions from Robotic View for Natural Human-Robot Interaction
by: Wang, Ziyi, et al.
Published: (2025)

Eye Motion Matters for 3D Face Reconstruction
by: Wang, Xuan, et al.
Published: (2024)

TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection
by: Yu, Zifan, et al.
Published: (2023)

Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton
by: Kang, Hongbo, et al.
Published: (2024)

Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
by: Li, Wenhao, et al.
Published: (2023)

EchoMotion: Unified Human Video and Motion Generation via Dual-Modality Diffusion Transformer
by: Yang, Yuxiao, et al.
Published: (2025)

Expressive Forecasting of 3D Whole-body Human Motions
by: Ding, Pengxiang, et al.
Published: (2023)

UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling
by: Li, Peiming, et al.
Published: (2025)

3D Skeleton-Based Action Recognition: A Review
by: Liu, Mengyuan, et al.
Published: (2025)

Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training
by: Yan, Sheng, et al.
Published: (2024)

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
by: Shi, Dachuan, et al.
Published: (2023)

Motion Guided Token Compression for Efficient Masked Video Modeling
by: Feng, Yukun, et al.
Published: (2024)

GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation
by: Wang, Haonan, et al.
Published: (2024)

UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation
by: Wang, Ziyi, et al.
Published: (2026)

TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
by: Wang, Yabiao, et al.
Published: (2024)

SRAM: Shape-Realism Alignment Metric for No Reference 3D Shape Evaluation
by: Liu, Sheng, et al.
Published: (2025)

Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation
by: Xu, Jiahao, et al.
Published: (2026)

TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
by: Li, Ruineng, et al.
Published: (2025)

Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation
by: Yarram, Sudhir, et al.
Published: (2024)

MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
by: Wang, Yuan, et al.
Published: (2024)

Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
by: Zhai, Yuanhao, et al.
Published: (2024)

Biomechanics-Guided Residual Approach to Generalizable Human Motion Generation and Estimation
by: Kang, Zixi, et al.
Published: (2025)

Uni-Inter: Unifying 3D Human Motion Synthesis Across Diverse Interaction Contexts
by: Liu, Sheng, et al.
Published: (2025)

FrankenMotion: Part-level Human Motion Generation and Composition
by: Li, Chuqiao, et al.
Published: (2026)

LumosFlow: Motion-Guided Long Video Generation
by: Chen, Jiahao, et al.
Published: (2025)

Morph: A Motion-free Physics Optimization Framework for Human Motion Generation
by: Li, Zhuo, et al.
Published: (2024)

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
by: Jin, Peng, et al.
Published: (2024)

Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning
by: Liu, Mengyuan, et al.
Published: (2025)

GenM$^3$: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation
by: Shi, Junyu, et al.
Published: (2025)

MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
by: Wang, Siyuan, et al.
Published: (2025)

Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
by: Yu, Qing, et al.
Published: (2024)