| Main Authors: | Yan, Sheng, Wang, Yong, Du, Xin, Yuan, Junsong, Liu, Mengyuan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.08337 |
Similar Items
MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions
by: Yan, Sheng, et al.
Published: (2024)
MoSa: Motion Generation with Scalable Autoregressive Modeling
by: Liu, Mengyuan, et al.
Published: (2025)
Cross-Modal Retrieval for Motion and Text via DropTriple Loss
by: Yan, Sheng, et al.
Published: (2023)
PoseMoE: Mixture-of-Experts Network for Monocular 3D Human Pose Estimation
by: Liu, Mengyuan, et al.
Published: (2025)
Lens Privacy Sealing: A New Benchmark and Method for Physical Privacy-Preserving Action Recognition
by: Liu, Mengyuan, et al.
Published: (2026)
LaxMotion: Rethinking Supervision Granularity for 3D Human Motion Generation
by: Liu, Sheng, et al.
Published: (2025)
PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation
by: Zhao, Sihan, et al.
Published: (2025)
Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation
by: Wang, Xinshun, et al.
Published: (2026)
HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation
by: Gan, Qijun, et al.
Published: (2025)
Recognizing Actions from Robotic View for Natural Human-Robot Interaction
by: Wang, Ziyi, et al.
Published: (2025)
Eye Motion Matters for 3D Face Reconstruction
by: Wang, Xuan, et al.
Published: (2024)
TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection
by: Yu, Zifan, et al.
Published: (2023)
Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton
by: Kang, Hongbo, et al.
Published: (2024)
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
by: Li, Wenhao, et al.
Published: (2023)
EchoMotion: Unified Human Video and Motion Generation via Dual-Modality Diffusion Transformer
by: Yang, Yuxiao, et al.
Published: (2025)
Expressive Forecasting of 3D Whole-body Human Motions
by: Ding, Pengxiang, et al.
Published: (2023)
UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling
by: Li, Peiming, et al.
Published: (2025)
3D Skeleton-Based Action Recognition: A Review
by: Liu, Mengyuan, et al.
Published: (2025)
Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training
by: Yan, Sheng, et al.
Published: (2024)
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
by: Shi, Dachuan, et al.
Published: (2023)
Motion Guided Token Compression for Efficient Masked Video Modeling
by: Feng, Yukun, et al.
Published: (2024)
GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation
by: Wang, Haonan, et al.
Published: (2024)
UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation
by: Wang, Ziyi, et al.
Published: (2026)
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
by: Wang, Yabiao, et al.
Published: (2024)
SRAM: Shape-Realism Alignment Metric for No Reference 3D Shape Evaluation
by: Liu, Sheng, et al.
Published: (2025)
Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation
by: Xu, Jiahao, et al.
Published: (2026)
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
by: Li, Ruineng, et al.
Published: (2025)
Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation
by: Yarram, Sudhir, et al.
Published: (2024)
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
by: Wang, Yuan, et al.
Published: (2024)
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
by: Zhai, Yuanhao, et al.
Published: (2024)
Biomechanics-Guided Residual Approach to Generalizable Human Motion Generation and Estimation
by: Kang, Zixi, et al.
Published: (2025)
Uni-Inter: Unifying 3D Human Motion Synthesis Across Diverse Interaction Contexts
by: Liu, Sheng, et al.
Published: (2025)
FrankenMotion: Part-level Human Motion Generation and Composition
by: Li, Chuqiao, et al.
Published: (2026)
LumosFlow: Motion-Guided Long Video Generation
by: Chen, Jiahao, et al.
Published: (2025)
Morph: A Motion-free Physics Optimization Framework for Human Motion Generation
by: Li, Zhuo, et al.
Published: (2024)
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
by: Jin, Peng, et al.
Published: (2024)
Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning
by: Liu, Mengyuan, et al.
Published: (2025)
GenM$^3$: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation
by: Shi, Junyu, et al.
Published: (2025)
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
by: Wang, Siyuan, et al.
Published: (2025)
Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
by: Yu, Qing, et al.
Published: (2024)