Staff View: Predicting Visual Attention in Graphic Design Documents :: Library Catalog

Bibliographic Details
Main Authors:	Chakraborty, Souradeep; Wei, Zijun; Kelton, Conor; Ahn, Seoyoung; Balasubramanian, Aruna; Zelinsky, Gregory J.; Samaras, Dimitris
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.02439

_version_	1866909239477272576
author	Chakraborty, Souradeep; Wei, Zijun; Kelton, Conor; Ahn, Seoyoung; Balasubramanian, Aruna; Zelinsky, Gregory J.; Samaras, Dimitris
author_facet	Chakraborty, Souradeep; Wei, Zijun; Kelton, Conor; Ahn, Seoyoung; Balasubramanian, Aruna; Zelinsky, Gregory J.; Samaras, Dimitris
contents	We present a model for predicting visual attention during the free viewing of graphic design documents. While existing works on this topic have aimed at predicting the static saliency of graphic designs, our work is the first attempt to use a deep-learning-based model to predict both the spatial attention and the dynamic temporal order in which document regions are fixated by gaze. We propose a two-stage model for predicting dynamic attention on such documents, with webpages being our primary choice of document design for demonstration. In the first stage, we predict the saliency maps for each of the document components (e.g., logos, banners, and text for webpages), conditioned on the type of document layout. These component saliency maps are then jointly used to predict the overall document saliency. In the second stage, we use these layout-specific component saliency maps as the state representation for an inverse reinforcement learning model of fixation scanpath prediction during document viewing. To test our model, we collected a new dataset consisting of eye movements from 41 people freely viewing 450 webpages (the largest dataset of its kind). Experimental results show that our model outperforms existing models in both saliency and scanpath prediction for webpages, and also generalizes very well to other graphic design documents, such as comics, posters, and mobile UIs, as well as to natural images.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_02439
institution	arXiv
publishDate	2024
record_format	arxiv
title	Predicting Visual Attention in Graphic Design Documents
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2407.02439

Similar Items