Bibliographic Details
Main Authors: Petrou, Christos, Partaourides, Harris, Balomenos, Athanasios, Kopsinis, Yannis, Chatzis, Sotirios
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2601.18372
author Petrou, Christos
Partaourides, Harris
Balomenos, Athanasios
Kopsinis, Yannis
Chatzis, Sotirios
contents Gaze prediction plays a critical role in Virtual Reality (VR) applications by reducing sensor-induced latency and enabling computationally demanding techniques such as foveated rendering, which rely on anticipating user attention. However, direct eye tracking is often unavailable due to hardware limitations or privacy concerns. To address this, we present a novel gaze prediction framework that combines Head-Mounted Display (HMD) motion signals with visual saliency cues derived from video frames. Our method employs UniSal, a lightweight saliency encoder, to extract visual features, which are then fused with HMD motion data and processed through a time-series prediction module. We evaluate two lightweight architectures, TSMixer and LSTM, for forecasting future gaze directions. Experiments on the EHTask dataset, along with deployment on commercial VR hardware, show that our approach consistently outperforms baselines such as Center-of-HMD and Mean Gaze. These results demonstrate the effectiveness of predictive gaze modeling in reducing perceptual lag and enhancing natural interaction in VR environments where direct eye tracking is constrained.
format Preprint
id arxiv_https___arxiv_org_abs_2601_18372
institution arXiv
publishDate 2026
record_format arxiv
title Gaze Prediction in Virtual Reality Without Eye Tracking Using Visual and Head Motion Cues
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2601.18372
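
The abstract compares the proposed framework against two baselines, Center-of-HMD and Mean Gaze. The sketch below shows one plausible reading of those baselines and of an angular error metric; it is not taken from the paper. It assumes gaze is expressed as head-relative (yaw, pitch) angles in degrees, that Center-of-HMD always predicts the display center (0, 0), and that Mean Gaze always predicts the average gaze seen in training data. All function names are hypothetical.

```python
import math


def to_unit_vector(yaw_deg, pitch_deg):
    """Convert head-relative (yaw, pitch) in degrees to a 3D unit vector."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.sin(yaw),  # x: right
        math.sin(pitch),                  # y: up
        math.cos(pitch) * math.cos(yaw),  # z: forward (display center)
    )


def angular_error_deg(pred, truth):
    """Angle in degrees between two gaze directions given as (yaw, pitch)."""
    va, vb = to_unit_vector(*pred), to_unit_vector(*truth)
    dot = sum(a * b for a, b in zip(va, vb))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def center_of_hmd_baseline(n):
    """Assumed baseline: gaze always sits at the center of the display."""
    return [(0.0, 0.0)] * n


def mean_gaze_baseline(train_gaze, n):
    """Assumed baseline: always predict the mean training gaze angles."""
    mean_yaw = sum(g[0] for g in train_gaze) / len(train_gaze)
    mean_pitch = sum(g[1] for g in train_gaze) / len(train_gaze)
    return [(mean_yaw, mean_pitch)] * n


def mean_angular_error(preds, truths):
    """Average angular error in degrees over paired predictions and targets."""
    return sum(angular_error_deg(p, t) for p, t in zip(preds, truths)) / len(truths)
```

A learned predictor (for example the LSTM or TSMixer models named in the abstract) would be scored with the same `mean_angular_error` call, so the baselines give a reference point for how much the fused saliency and head-motion features help. Averaging yaw and pitch directly is a simplification that is reasonable only for gaze angles close to the display center.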