Bibliographic Details
Main Authors: Vellenga, Koen, Steinhauer, H. Joe, Andersson, Jonas, Sjögren, Anders
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition, Machine Learning
Online Access: https://arxiv.org/abs/2510.05006
_version_ 1866911194109968384
author Vellenga, Koen
Steinhauer, H. Joe
Andersson, Jonas
Sjögren, Anders
author_facet Vellenga, Koen
Steinhauer, H. Joe
Andersson, Jonas
Sjögren, Anders
contents Deep neural networks (DNNs) are increasingly applied to safety-critical tasks in resource-constrained environments, such as video-based driver action and intention recognition. While last-layer probabilistic deep learning (LL-PDL) methods can detect out-of-distribution (OOD) instances, their performance varies. As an alternative to last-layer approaches, we propose extending pre-trained DNNs with transformation layers that produce multiple latent representations, from which the uncertainty is estimated. We evaluate our latent uncertainty representation (LUR) and repulsively trained LUR (RLUR) approaches against eight PDL methods across four video-based driver action and intention recognition datasets, comparing classification performance, calibration, and uncertainty-based OOD detection. We also contribute 28,000 frame-level action labels and 1,194 video-level intention labels for the NuScenes dataset. Our results show that LUR and RLUR achieve in-distribution classification performance comparable to that of other LL-PDL approaches. For uncertainty-based OOD detection, LUR matches top-performing PDL methods while being more efficient to train and easier to tune than approaches that require Markov chain Monte Carlo sampling or repulsive training procedures.
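The abstract's core idea (several transformation layers on top of a frozen pre-trained network, each yielding its own latent representation, with uncertainty read off from the resulting predictions) can be illustrated with a minimal NumPy sketch. This record does not describe the actual LUR architecture, so everything concrete here is an assumption made for illustration: the hypothetical function name `latent_uncertainty`, the tanh transformation layers, the single shared classification head, the random stand-in for backbone features, and the use of predictive entropy as the uncertainty score.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def latent_uncertainty(features, transforms, head_w, head_b):
    """Illustrative only; not the paper's implementation.

    features:   (N, D) array standing in for frozen backbone outputs.
    transforms: list of K (W, b) pairs, W of shape (D, H), b of shape (H,).
    head_w:     (H, C) weights of an assumed shared classification head.
    head_b:     (C,) bias of that head.
    Returns the mean class probabilities (N, C) and their entropy (N,).
    """
    probs = []
    for W, b in transforms:
        # Each transformation layer gives one latent representation.
        latent = np.tanh(features @ W + b)
        probs.append(softmax(latent @ head_w + head_b))
    probs = np.stack(probs)              # (K, N, C)
    mean_probs = probs.mean(axis=0)      # average over the K latents
    # Predictive entropy of the averaged distribution (assumed score).
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
    return mean_probs, entropy


rng = np.random.default_rng(0)
N, D, H, C, K = 4, 16, 8, 5, 3           # arbitrary illustrative sizes
features = rng.normal(size=(N, D))       # stand-in for backbone features
transforms = [(rng.normal(size=(D, H)), np.zeros(H)) for _ in range(K)]
head_w = rng.normal(size=(H, C))
head_b = np.zeros(C)

mean_probs, entropy = latent_uncertainty(features, transforms, head_w, head_b)
```

Under these assumptions, a higher entropy value marks an input on which the K latent representations yield a flatter averaged prediction; thresholding such a score is one common way to flag possible OOD inputs.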
format Preprint
id arxiv_https___arxiv_org_abs_2510_05006
institution arXiv
publishDate 2025
record_format arxiv
title Latent Uncertainty Representations for Video-based Driver Action and Intention Recognition
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2510.05006