Bibliographic Details
Main Authors: Liu, Qiong, Xiong, Ruofei, Chen, Xingzhen, Peng, Muyao, Yang, You
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2604.00372
_version_ 1866914435952541696
author Liu, Qiong
Xiong, Ruofei
Chen, Xingzhen
Peng, Muyao
Yang, You
author_facet Liu, Qiong
Xiong, Ruofei
Chen, Xingzhen
Peng, Muyao
Yang, You
contents Multi-modal color and depth data (RGB-D) plays an important role in recent research on indoor scene recognition. In this representation, the depth map describes the 3D structure of a scene and the geometric relations among objects. Previous work has shown that local features of both modalities are vital for improving recognition accuracy. However, adaptively selecting and effectively exploiting these key local features remains an open problem. This paper proposes a dynamic graph model with an adaptive node selection mechanism to address this problem. In this model, a dynamic graph is built to represent the relations among objects and the scene, and an adaptive node selection method picks key local features from both the RGB and depth modalities for graph modeling. The nodes are then grouped into three levels, representing near or far relations among objects. Moreover, the graph model is updated dynamically according to attention weights. Finally, the updated and optimized features of the RGB and depth modalities are fused for indoor scene recognition. Experiments are performed on public datasets including SUN RGB-D and NYU Depth v2. Extensive results demonstrate that the method outperforms state-of-the-art methods and show that it is able to exploit crucial local features from both the RGB and depth modalities.
format Preprint
id arxiv_https___arxiv_org_abs_2604_00372
institution arXiv
publishDate 2026
record_format arxiv
title Dynamic Graph Neural Network with Adaptive Features Selection for RGB-D Based Indoor Scene Recognition
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2604.00372
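
The pipeline outlined in the abstract (adaptive node selection, grouping into three near-to-far levels, attention-based graph update, fusion of RGB and depth features) can be illustrated with a small sketch. This is not the authors' implementation: the record gives no architectural details, so every function name here is hypothetical, the L2-norm importance score stands in for whatever selection mechanism the paper uses, the quantile-based distance bins and scaled dot-product attention are assumptions, and mean pooling followed by concatenation is only one plausible fusion choice.

```python
import numpy as np

def select_nodes(features, positions, k):
    """Keep the k local features with the largest L2 norm (assumed importance score)."""
    scores = np.linalg.norm(features, axis=1)
    idx = np.argsort(scores)[::-1][:k]
    return features[idx], positions[idx]

def group_by_distance(positions, n_levels=3):
    """Label every node pair with a level 0..n_levels-1 (near to far) using distance quantiles."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    edges = np.quantile(dist, np.linspace(0.0, 1.0, n_levels + 1)[1:-1])
    return np.digitize(dist, edges)

def attention_update(nodes, levels, n_levels=3):
    """One residual update: attention-weighted messages computed per level, then averaged."""
    logits = nodes @ nodes.T / np.sqrt(nodes.shape[1])
    expd = np.exp(logits - logits.max(axis=1, keepdims=True))
    message = np.zeros_like(nodes)
    for level in range(n_levels):
        # Restrict attention to the pairs that belong to this level.
        weights = expd * (levels == level)
        # Rows with no pair at this level keep all-zero weights.
        weights = weights / np.maximum(weights.sum(axis=1, keepdims=True), 1e-12)
        message += weights @ nodes
    return nodes + message / n_levels

def process_modality(features, positions, k):
    """Select nodes, group their pairwise relations, and apply one graph update."""
    nodes, pos = select_nodes(features, positions, k)
    levels = group_by_distance(pos)
    return attention_update(nodes, levels), levels

# Toy data: 20 local features of dimension 8 per modality, with 2D positions.
rng = np.random.default_rng(0)
positions = rng.uniform(size=(20, 2))
rgb_nodes, rgb_levels = process_modality(rng.normal(size=(20, 8)), positions, k=6)
depth_nodes, depth_levels = process_modality(rng.normal(size=(20, 8)), positions, k=6)

# Assumed fusion: mean-pool each modality's nodes and concatenate.
fused = np.concatenate([rgb_nodes.mean(axis=0), depth_nodes.mean(axis=0)])
print(fused.shape)  # (16,)
```

In a trained model the selection scores, attention projections, and fusion layer would be learned, and the fused vector would feed a scene classifier; the sketch only shows how the four stages named in the abstract connect.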