Bibliographic Details
Main Authors: Titkov, Roman; Zubkov, Egor; Yudin, Dmitry; Mahmoud, Jaafar; Mohrat, Malik; Sidorov, Gennady
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2506.03073
_version_ 1866913872860938240
author Titkov, Roman
Zubkov, Egor
Yudin, Dmitry
Mahmoud, Jaafar
Mohrat, Malik
Sidorov, Gennady
author_facet Titkov, Roman
Zubkov, Egor
Yudin, Dmitry
Mahmoud, Jaafar
Mohrat, Malik
Sidorov, Gennady
contents Modern Gaussian Splatting methods have proven highly effective for real-time photorealistic rendering of 3D scenes. However, integrating semantic information into this representation remains a significant challenge, especially when real-time performance must be maintained for SLAM (Simultaneous Localization and Mapping) applications. In this work, we introduce LEG-SLAM, a novel approach that fuses an optimized Gaussian Splatting implementation with visual-language feature extraction using DINOv2, followed by a learnable feature compressor based on Principal Component Analysis, while enabling online dense SLAM. Our method simultaneously generates high-quality photorealistic images and semantically labeled scene maps, achieving real-time scene reconstruction at more than 10 fps on the Replica dataset and 18 fps on ScanNet. Experimental results show that our approach significantly outperforms state-of-the-art methods in reconstruction speed while achieving competitive rendering quality. The proposed system eliminates the need for prior data preparation, such as camera ego-motion or pre-computed static semantic maps. With potential applications in autonomous robotics, augmented reality, and other interactive domains, LEG-SLAM represents a significant step forward in real-time semantic 3D Gaussian-based SLAM. Project page: https://titrom025.github.io/LEG-SLAM/
format Preprint
id arxiv_https___arxiv_org_abs_2506_03073
institution arXiv
publishDate 2025
record_format arxiv
title LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2506.03073