Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Hou, Xinlong; Shen, Sen; Li, Xueshen; Gao, Xinran; Huang, Ziyi; Holiday, Steven J.; Cribbet, Matthew R.; White, Susan W.; Sazonov, Edward; Gan, Yu
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.01966

_version_	1866910933818802176
author	Hou, Xinlong; Shen, Sen; Li, Xueshen; Gao, Xinran; Huang, Ziyi; Holiday, Steven J.; Cribbet, Matthew R.; White, Susan W.; Sazonov, Edward; Gan, Yu
author_facet	Hou, Xinlong; Shen, Sen; Li, Xueshen; Gao, Xinran; Huang, Ziyi; Holiday, Steven J.; Cribbet, Matthew R.; White, Susan W.; Sazonov, Edward; Gan, Yu
contents	Accurately monitoring the screen exposure of young children is important for research on phenomena linked to screen use, such as childhood obesity, physical activity, and social interaction. Most existing studies rely on self-report or on manual measures from bulky wearable sensors, and therefore lack efficiency and accuracy in capturing quantitative screen exposure data. In this work, we developed a novel sensor informatics framework that uses egocentric images from a wearable sensor, termed the screen time tracker (STT), together with a vision language model (VLM). In particular, we devised a multi-view VLM that takes multiple views from egocentric image sequences and interprets screen exposure dynamically. We validated our approach on a dataset of children's free-living activities, demonstrating significant improvement over existing methods based on plain vision language models and object detection models. The results support the promise of this monitoring approach, which could optimize behavioral research on screen exposure in children's naturalistic settings.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_01966
institution	arXiv
publishDate	2024
record_format	arxiv
title	Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2410.01966

Similar Items