Bibliographic Details
Main Authors: Kim, Minkyung; Che, Henry; Chandaka, Bhargav; Pramuanpornsatid, Bhumsitt; Yang, Chengyu; Cheng, Sheng; Wang, Xiaofeng; Hovakimyan, Naira; Wang, Shenlong
Format: Preprint
Published: 2026
Subjects: Robotics
Online Access: https://arxiv.org/abs/2605.17421
_version_ 1866910229135884288
author Kim, Minkyung
Che, Henry
Chandaka, Bhargav
Pramuanpornsatid, Bhumsitt
Yang, Chengyu
Cheng, Sheng
Wang, Xiaofeng
Hovakimyan, Naira
Wang, Shenlong
contents Accurate visual state estimation has been a central topic in robotics with a wide range of applications in robot navigation, autonomous driving, and autonomous flight. Recent advances in robot perception have led to significant improvements in the accuracy and robustness of state estimation, yet a fundamental challenge remains in how to quantify and calibrate its precision, i.e., how confident we are in an estimate and whether failures can be detected. This issue is particularly pronounced in visual-inertial odometry (VIO), where the heteroscedastic and multimodal nature of the problem makes uncertainty quantification especially difficult. This paper introduces MUSE (Multimodal Uncertainty Quantification of State Estimation), a novel real-time learning-based framework that leverages the strong and efficient sequential modeling capacity of Mamba to estimate localization uncertainty from multiple asynchronous sensor streams. Experiments on both public and in-house datasets demonstrate that MUSE achieves superior reliability and robustness compared to existing uncertainty quantification methods, and ablation studies justify the benefits of its key design choices.
format Preprint
id arxiv_https___arxiv_org_abs_2605_17421
institution arXiv
publishDate 2026
record_format arxiv
title MUSE: Multimodal Uncertainty Quantification of State Estimation
topic Robotics
url https://arxiv.org/abs/2605.17421