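| Main Authors: | Shi, Zhenwu; Gong, Jingyu; Wang, Peiwei; Wang, Xingzan; Qian, Tianwen; Li, Wenxi; Fang, Yuan; Xie, Jiao; Ma, Lizhuang; Lin, Shaohui |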
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2605.30969 |
| _version_ | 1866914616069586944 |
|---|---|
| author | Shi, Zhenwu; Gong, Jingyu; Wang, Peiwei; Wang, Xingzan; Qian, Tianwen; Li, Wenxi; Fang, Yuan; Xie, Jiao; Ma, Lizhuang; Lin, Shaohui |
| author_facet | Shi, Zhenwu; Gong, Jingyu; Wang, Peiwei; Wang, Xingzan; Qian, Tianwen; Li, Wenxi; Fang, Yuan; Xie, Jiao; Ma, Lizhuang; Lin, Shaohui |
| contents | Text-based human motion editing aims to modify existing motion sequences according to natural language instructions while keeping the result consistent with the original motion. Existing diffusion-based approaches often rely on heuristic similarity cues or coarse global conditioning, leading to motion distortion and suboptimal semantic alignment. The key challenge lies in balancing change (i.e., precisely editing target regions) and invariance (i.e., preserving unedited parts). To address this challenge, we propose an Omni-Supervised Positive-Negative Learning framework, named OmniME. Our method integrates three complementary components: (1) retrospective feature supervision that enforces coarse-to-fine consistency across transformer layers, (2) a motion preservation mechanism that focuses on subtle variations according to the source-target similarity, and (3) triplet-based semantic alignment that strengthens text-motion correspondence. Together, these components form a unified supervision paradigm that balances change and invariance. Extensive experiments on the MotionFix and STANCE Adjustment datasets demonstrate that OmniME achieves state-of-the-art performance in editing alignment, validating the effectiveness of our unified learning framework. Our source code and models have been released at: https://github.com/rocket-ycyer/OmniME.git |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2605_30969 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Omni-Supervised Motion Editing: Balancing Change and Invariance through Positive-Negative Learning |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2605.30969 |