Bibliographic Details
Main Authors: Schmidt, Sebastian; Körner, Julius; Fuchsgruber, Dominik; Gasperini, Stefano; Tombari, Federico; Günnemann, Stephan
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2504.04841
author Schmidt, Sebastian
Körner, Julius
Fuchsgruber, Dominik
Gasperini, Stefano
Tombari, Federico
Günnemann, Stephan
contents In panoptic segmentation, individual instances must be separated within semantic classes. Because state-of-the-art methods rely on a predefined set of classes, they struggle with novel categories and out-of-distribution (OOD) data. This is particularly problematic in safety-critical applications, such as autonomous driving, where reliability in unseen scenarios is essential. We address the gap between outstanding benchmark performance and reliability by proposing Prior2Former (P2F), the first approach for segmentation vision transformers rooted in evidential learning. P2F extends the mask vision transformer architecture by incorporating a Beta prior for computing model uncertainty in pixel-wise binary mask assignments. This design enables high-quality uncertainty estimation that effectively detects novel and OOD objects, enabling state-of-the-art anomaly instance segmentation and open-world panoptic segmentation. Unlike most segmentation models that address unknown classes, P2F operates without access to OOD data samples or contrastive training on void (i.e., unlabeled) classes, making it highly applicable in real-world scenarios where such prior information is unavailable. Additionally, P2F can be flexibly applied to both anomaly instance segmentation and panoptic segmentation. In comprehensive experiments on the Cityscapes, COCO, SegmentMeIfYouCan, and OoDIS datasets, P2F demonstrates state-of-the-art performance across these benchmarks.
format Preprint
id arxiv_https___arxiv_org_abs_2504_04841
institution arXiv
publishDate 2025
record_format arxiv
title Prior2Former -- Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2504.04841
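Illustrative note: the abstract describes a Beta prior over pixel-wise binary mask assignments as the source of model uncertainty. The sketch below shows one generic way such an uncertainty can be derived in evidential learning; it is not the P2F implementation. It assumes that each pixel comes with two non-negative evidence values (for and against mask membership) and that a uniform Beta(1, 1) prior is used. The names BetaPixel, beta_from_evidence, and uncertainty_map are hypothetical, and the uncertainty measure (2 divided by the total pseudo-count) is the standard two-class "vacuity" from subjective logic, chosen here as an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BetaPixel:
    """Beta(alpha, beta) belief about one pixel's binary mask assignment."""
    alpha: float  # pseudo-counts for "pixel belongs to the mask"
    beta: float   # pseudo-counts for "pixel does not belong to the mask"

    @property
    def mean(self) -> float:
        # Expected probability that the pixel belongs to the mask.
        return self.alpha / (self.alpha + self.beta)

    @property
    def vacuity(self) -> float:
        # Two-class vacuity: 1.0 with no evidence, approaching 0 as evidence grows.
        return 2.0 / (self.alpha + self.beta)


def beta_from_evidence(e_fg: float, e_bg: float) -> BetaPixel:
    """Add evidence to a uniform Beta(1, 1) prior (an assumption of this sketch)."""
    if e_fg < 0 or e_bg < 0:
        raise ValueError("evidence must be non-negative")
    return BetaPixel(alpha=1.0 + e_fg, beta=1.0 + e_bg)


def uncertainty_map(evidence_fg: List[List[float]],
                    evidence_bg: List[List[float]]) -> List[List[float]]:
    """Per-pixel vacuity for a 2D grid of evidence values."""
    return [
        [beta_from_evidence(f, b).vacuity for f, b in zip(row_f, row_b)]
        for row_f, row_b in zip(evidence_fg, evidence_bg)
    ]


if __name__ == "__main__":
    # One row, two pixels: no evidence at all vs. strong evidence for the mask.
    fg = [[0.0, 8.0]]
    bg = [[0.0, 0.0]]
    print(uncertainty_map(fg, bg))
```

In this toy setup, a pixel with no evidence for or against any mask keeps the full prior uncertainty, which is the kind of signal that could flag an unfamiliar object, whereas a pixel with substantial evidence has low uncertainty. How P2F actually produces evidence and aggregates uncertainty across masks is described in the preprint itself.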