Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Rahman, Muhammad Rameez Ur, Giraldo, Jhony H., Spinelli, Indro, Lathuilière, Stéphane, Galasso, Fabio
Format: Preprint
Veröffentlicht: 2024
Schlagworte:
Online-Zugang:https://arxiv.org/abs/2408.09424
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Inhaltsangabe:
  • Event cameras, known for low-latency operation and superior performance in challenging lighting conditions, are suitable for sensitive computer vision tasks such as semantic segmentation in autonomous driving. However, challenges arise due to limited event-based data and the absence of large-scale segmentation benchmarks. Current works are confined to closed-set semantic segmentation, limiting their adaptability to other applications. In this paper, we introduce OVOSE, the first Open-Vocabulary Semantic Segmentation algorithm for Event cameras. OVOSE leverages synthetic event data and knowledge distillation from a pre-trained image-based foundation model to an event-based counterpart, effectively preserving spatial context and transferring open-vocabulary semantic segmentation capabilities. We evaluate the performance of OVOSE on two driving semantic segmentation datasets DDD17, and DSEC-Semantic, comparing it with existing conventional image open-vocabulary models adapted for event-based data. Similarly, we compare OVOSE with state-of-the-art methods designed for closed-set settings in unsupervised domain adaptation for event-based semantic segmentation. OVOSE demonstrates superior performance, showcasing its potential for real-world applications. The code is available at https://github.com/ram95d/OVOSE.