Bibliographic Details
Main Authors: Zhang, Qiang; Xiao, Tong; Habeeb, Haroun; Laich, Larissa; Bouaziz, Sofien; Snape, Patrick; Zhang, Wenjing; Cioffi, Matthew; Zhang, Peizhao; Pidlypenskyi, Pavel; Lin, Winnie; Ma, Luming; Wang, Mengjiao; Li, Kunpeng; Long, Chengjiang; Song, Steven; Prazak, Martin; Sjoholm, Alexander; Deogade, Ajinkya; Lee, Jaebong; Mangas, Julio Delgado; Aubel, Amaury
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2601.03507
author Zhang, Qiang
Xiao, Tong
Habeeb, Haroun
Laich, Larissa
Bouaziz, Sofien
Snape, Patrick
Zhang, Wenjing
Cioffi, Matthew
Zhang, Peizhao
Pidlypenskyi, Pavel
Lin, Winnie
Ma, Luming
Wang, Mengjiao
Li, Kunpeng
Long, Chengjiang
Song, Steven
Prazak, Martin
Sjoholm, Alexander
Deogade, Ajinkya
Lee, Jaebong
Mangas, Julio Delgado
Aubel, Amaury
contents We present a novel system for real-time tracking of facial expressions using egocentric views captured from a set of infrared cameras embedded in a virtual reality (VR) headset. Our technology enables any user to accurately drive the facial expressions of virtual characters in a non-intrusive manner and without the need for a lengthy calibration step. At the core of our system is a distillation-based approach to train a machine learning model on heterogeneous data and labels coming from multiple sources, e.g., synthetic and real images. As part of our dataset, we collected data from 18k diverse subjects using a lightweight capture setup consisting of a mobile phone and a custom VR headset with extra cameras. To process this data, we developed a robust differentiable rendering pipeline enabling us to automatically extract facial expression labels. Our system opens up new avenues for communication and expression in virtual environments, with applications in video conferencing, gaming, entertainment, and remote collaboration.
format Preprint
id arxiv_https___arxiv_org_abs_2601_03507
institution arXiv
publishDate 2026
record_format arxiv
title REFA: Real-time Egocentric Facial Animations for Virtual Reality
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2601.03507