Bibliographic Details
Main Authors: Saakyan, William; Norden, Matthias; Eversmann, Lola; Kirsch, Simon; Lin, Muyu; Guendelman, Simon; Dziobek, Isabel; Drimalla, Hanna
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Machine Learning
Online Access: https://arxiv.org/abs/2509.21352
contents Due to the complex and resource-intensive nature of diagnosing Autism Spectrum Condition (ASC), several computer-aided diagnostic support methods have been proposed to detect autism by analyzing behavioral cues in patient video data. While these models show promising results on some datasets, they struggle with poor gaze feature performance and lack of real-world generalizability. To tackle these challenges, we analyze a standardized video dataset comprising 168 participants with ASC (46% female) and 157 non-autistic participants (46% female), making it, to our knowledge, the largest and most balanced dataset available. We conduct a multimodal analysis of facial expressions, voice prosody, head motion, heart rate variability (HRV), and gaze behavior. To address the limitations of prior gaze models, we introduce novel statistical descriptors that quantify variability in eye gaze angles, improving gaze-based classification accuracy from 64% to 69% and aligning computational findings with clinical research on gaze aversion in ASC. Using late fusion, we achieve a classification accuracy of 74%, demonstrating the effectiveness of integrating behavioral markers across multiple modalities. Our findings highlight the potential for scalable, video-based screening tools to support autism assessment.
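The abstract names two techniques without giving details: statistical descriptors of variability in eye gaze angles, and late fusion of per-modality classifiers. The following is a minimal sketch of what such steps might look like, not the authors' implementation. The function names, the choice of descriptors (standard deviation and range), the unweighted averaging rule, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def gaze_variability(angles_deg):
    """Summarize how much a sequence of gaze angles (in degrees) varies.

    Hypothetical descriptors: the record does not say which statistics
    the authors actually use.
    """
    return {
        "std": pstdev(angles_deg),                    # population standard deviation
        "range": max(angles_deg) - min(angles_deg),   # spread between extremes
    }


def late_fusion(modality_probs, threshold=0.5):
    """Fuse per-modality probabilities for the positive class.

    Assumed rule: unweighted mean of the probabilities, then a threshold.
    Each modality is expected to have its own classifier trained separately.
    """
    fused = mean(modality_probs.values())
    return fused, fused >= threshold


# Example with made-up probabilities from three separately trained models.
probs = {"face": 0.8, "voice": 0.6, "gaze": 0.4}
fused_prob, predicted_positive = late_fusion(probs)
```

In late fusion each modality is modeled on its own and only the model outputs are combined, so a weak or missing modality can be down-weighted or dropped without retraining the others.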
format Preprint
id arxiv_https___arxiv_org_abs_2509_21352
institution arXiv
publishDate 2025
record_format arxiv
title Improving Autism Detection with Multimodal Behavioral Analysis
topic Computer Vision and Pattern Recognition
Machine Learning