| Main Authors: | Chen, Yuexi; Morariu, Vlad I.; Truong, Anh; Liu, Zhicheng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Human-Computer Interaction; Artificial Intelligence; Machine Learning |
| Online Access: | https://arxiv.org/abs/2403.08049 |
| _version_ | 1866916157479452672 |
|---|---|
| author | Chen, Yuexi; Morariu, Vlad I.; Truong, Anh; Liu, Zhicheng |
| author_facet | Chen, Yuexi; Morariu, Vlad I.; Truong, Anh; Liu, Zhicheng |
| contents | Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos. However, manually creating such tutorials is tedious, and existing automated solutions are often restricted to a particular domain. While AI models hold promise, it is unclear how to effectively harness their powers, given the multi-modal data involved and the vast landscape of models. We present TutoAI, a cross-domain framework for AI-assisted mixed-media tutorial creation on physical tasks. First, we distill common tutorial components by surveying existing work; then, we present an approach to identify, assemble, and evaluate AI models for component extraction; finally, we propose guidelines for designing user interfaces (UI) that support tutorial creation based on AI-generated components. We show that TutoAI has achieved higher or similar quality compared to a baseline model in preliminary user studies. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2403_08049 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks |
| topic | Human-Computer Interaction; Artificial Intelligence; Machine Learning |
| url | https://arxiv.org/abs/2403.08049 |