Bibliographic Details
Main Authors: Shi, Zhenwu, Gong, Jingyu, Wang, Peiwei, Wang, Xingzan, Qian, Tianwen, Li, Wenxi, Fang, Yuan, Xie, Jiao, Ma, Lizhuang, Lin, Shaohui
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2605.30969
_version_ 1866914616069586944
author Shi, Zhenwu
Gong, Jingyu
Wang, Peiwei
Wang, Xingzan
Qian, Tianwen
Li, Wenxi
Fang, Yuan
Xie, Jiao
Ma, Lizhuang
Lin, Shaohui
contents Text-based human motion editing aims to modify existing motion sequences according to natural language instructions while preserving consistency with the original motion. Existing diffusion-based approaches often rely on heuristic similarity cues or coarse global conditioning, leading to motion distortion and suboptimal semantic alignment. The key challenge lies in balancing change (i.e., precisely editing target regions) and invariance (i.e., preserving unedited parts). To address this challenge, we propose an Omni-Supervised Positive-Negative Learning framework, named OmniME. Our method integrates three complementary components: (1) retrospective feature supervision that enforces coarse-to-fine consistency across transformer layers, (2) a motion preservation mechanism that focuses on subtle variations according to the source-target similarity, and (3) triplet-based semantic alignment that strengthens text-motion correspondence. Together, these components form a unified supervision paradigm that balances change and invariance. Extensive experiments on the MotionFix and STANCE Adjustment datasets demonstrate that OmniME achieves state-of-the-art performance in editing alignment, validating the effectiveness of our unified learning framework. Our source code and models have been released at: https://github.com/rocket-ycyer/OmniME.git
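The abstract names triplet-based semantic alignment but does not specify its loss. As a rough illustration only, the following sketch shows a generic triplet margin loss, which may or may not match what OmniME uses. It assumes the text instruction embedding is the anchor, the edited motion embedding is the positive, and the unedited source motion embedding is the negative. The function names, the Euclidean distance, the margin value, and the toy vectors are all hypothetical and are not taken from the paper or its repository.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`; otherwise it grows linearly.
    """
    return max(
        0.0,
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
    )


# Toy, hand-picked embeddings (assumptions for illustration only).
text_emb = [0.0, 0.0]            # anchor: edit instruction
edited_motion_emb = [0.0, 0.0]   # positive: motion that follows the instruction
source_motion_emb = [3.0, 4.0]   # negative: unedited source motion

well_aligned = triplet_margin_loss(text_emb, edited_motion_emb, source_motion_emb)
misaligned = triplet_margin_loss(text_emb, source_motion_emb, edited_motion_emb)
```

In this toy case the first call incurs no penalty because the positive already sits much closer to the anchor than the negative, while swapping the two produces a positive loss that a training step would try to reduce.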
format Preprint
id arxiv_https___arxiv_org_abs_2605_30969
institution arXiv
publishDate 2026
record_format arxiv
title Omni-Supervised Motion Editing: Balancing Change and Invariance through Positive-Negative Learning
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2605.30969