Bibliographic Details
Main Authors: Chen, Yuexi, Morariu, Vlad I., Truong, Anh, Liu, Zhicheng
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2403.08049
author Chen, Yuexi
Morariu, Vlad I.
Truong, Anh
Liu, Zhicheng
contents Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos. However, manually creating such tutorials is tedious, and existing automated solutions are often restricted to a particular domain. While AI models hold promise, it is unclear how to effectively harness their powers, given the multi-modal data involved and the vast landscape of models. We present TutoAI, a cross-domain framework for AI-assisted mixed-media tutorial creation on physical tasks. First, we distill common tutorial components by surveying existing work; then, we present an approach to identify, assemble, and evaluate AI models for component extraction; finally, we propose guidelines for designing user interfaces (UI) that support tutorial creation based on AI-generated components. We show that TutoAI has achieved higher or similar quality compared to a baseline model in preliminary user studies.
format Preprint
id arxiv_https___arxiv_org_abs_2403_08049
institution arXiv
publishDate 2024
record_format arxiv
title TutoAI: A Cross-domain Framework for AI-assisted Mixed-media Tutorial Creation on Physical Tasks
topic Human-Computer Interaction
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2403.08049