| Main Authors: | Zuo, Xiaoye; Athanasiou, Nikos; Delmas, Ginger; Huang, Yiming; Fu, Xingyu; Liu, Lingjie |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition; Human-Computer Interaction |
| Online Access: | https://arxiv.org/abs/2508.07501 |
| _version_ | 1866914089202089984 |
|---|---|
| author | Zuo, Xiaoye; Athanasiou, Nikos; Delmas, Ginger; Huang, Yiming; Fu, Xingyu; Liu, Lingjie |
| author_facet | Zuo, Xiaoye; Athanasiou, Nikos; Delmas, Ginger; Huang, Yiming; Fu, Xingyu; Liu, Lingjie |
| contents | Good form is the difference between strength and strain, yet for the fast-growing community of at-home fitness enthusiasts, expert feedback is often out of reach. FormCoach transforms a simple camera into an always-on, interactive AI training partner, capable of spotting subtle form errors and delivering tailored corrections in real time, leveraging vision-language models (VLMs). We showcase this capability through a web interface and benchmark state-of-the-art VLMs on a dataset of 1,700 expert-annotated user-reference video pairs spanning 22 strength and mobility exercises. To accelerate research in AI-driven coaching, we release both the dataset and an automated, rubric-based evaluation pipeline, enabling standardized comparison across models. Our benchmarks reveal substantial gaps compared to human-level coaching, underscoring both the challenges and opportunities in integrating nuanced, context-aware movement analysis into interactive AI systems. By framing form correction as a collaborative and creative process between humans and machines, FormCoach opens a new frontier in embodied AI. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2508_07501 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | FormCoach: Lift Smarter, Not Harder |
| topic | Computer Vision and Pattern Recognition; Human-Computer Interaction |
| url | https://arxiv.org/abs/2508.07501 |