Staff View: Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth Estimation :: Library Catalog

Bibliographic Details
Main Authors:	Boya, Wang; Shuo, Wang; Dong, Ye; Ziwen, Dou
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2309.09272

_version_	1866929201760698368
author	Boya, Wang Shuo, Wang Dong, Ye Ziwen, Dou
author_facet	Boya, Wang Shuo, Wang Dong, Ye Ziwen, Dou
contents	With the frequent use of self-supervised monocular depth estimation in robotics and autonomous driving, model efficiency is becoming increasingly important. Most current approaches apply much larger and more complex networks to improve the precision of depth estimation. Some researchers have incorporated Transformers into self-supervised monocular depth estimation to achieve better performance. However, this approach leads to high parameter counts and heavy computation. We present a fully convolutional depth estimation network using contextual feature fusion. In contrast to UNet++ and HRNet, we use high-resolution and low-resolution features, rather than long-range fusion, to preserve information on small targets and fast-moving objects. We further improve depth estimation results by employing lightweight, convolution-based channel attention in the decoder stage. Our method reduces the number of parameters without sacrificing accuracy. Experiments on the KITTI benchmark show that our method achieves better results than many large models, such as Monodepth2, with only 30% of the parameters. The source code is available at https://github.com/boyagesmile/DNA-Depth.
format	Preprint
id	arxiv_https___arxiv_org_abs_2309_09272
institution	arXiv
publishDate	2023
record_format	arxiv
title	Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth Estimation
topic	Computer Vision and Pattern Recognition Artificial Intelligence
url	https://arxiv.org/abs/2309.09272

Similar Items