Staff View: TransiT: Transient Transformer for Non-line-of-sight Videography :: Library Catalog

Bibliographic Details
Main Authors:	Li, Ruiqian; Shen, Siyuan; Xia, Suan; Wang, Ziheng; Peng, Xingyue; Song, Chengxuan; Zhu, Yingsheng; Wu, Tao; Li, Shiying; Yu, Jingyi
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.11328

_version_	1866910875864006656
author	Li, Ruiqian; Shen, Siyuan; Xia, Suan; Wang, Ziheng; Peng, Xingyue; Song, Chengxuan; Zhu, Yingsheng; Wu, Tao; Li, Shiying; Yu, Jingyi
author_facet	Li, Ruiqian; Shen, Siyuan; Xia, Suan; Wang, Ziheng; Peng, Xingyue; Song, Chengxuan; Zhu, Yingsheng; Wu, Tao; Li, Shiying; Yu, Jingyi
contents	High-quality, high-speed videography using Non-Line-of-Sight (NLOS) imaging benefits autonomous navigation, collision prevention, and post-disaster search and rescue tasks. Current solutions have to trade off frame rate against image quality. High frame rates, for example, can be achieved by reducing either the per-point scanning time or the scanning density, but at the cost of lowering the information density of individual frames. A fast scanning process further reduces the signal-to-noise ratio, and different scanning systems exhibit different distortion characteristics. In this work, we design and employ a new Transient Transformer architecture called TransiT to achieve real-time NLOS recovery under fast scans. TransiT directly compresses the temporal dimension of the input transients to extract features, reducing computation costs and meeting high frame rate requirements. It further adopts a feature fusion mechanism and employs a spatial-temporal Transformer to help capture features of NLOS transient videos. Moreover, TransiT applies transfer learning to bridge the gap between synthetic and real-measured data. In real experiments, TransiT reconstructs NLOS videos at a $64 \times 64$ resolution at 10 frames per second from sparse $16 \times 16$ transients measured at an exposure time of 0.4 ms per point. We will make our code and dataset available to the community.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_11328
institution	arXiv
publishDate	2025
record_format	arxiv
title	TransiT: Transient Transformer for Non-line-of-sight Videography
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2503.11328
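
As a rough check on the figures quoted in the contents field, the following sketch computes the per-frame scan time and the resulting upper bound on frame rate. It assumes that acquisition time is dominated by per-point exposure, with no overhead for scanner movement or reconstruction; that assumption and all function names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope scan budget from the numbers in the abstract.
# Assumption (not from the paper): frame time = points * exposure, no overhead.

def frame_time_ms(grid_side: int, exposure_ms: float) -> float:
    """Time to scan one grid_side x grid_side frame, in milliseconds."""
    return grid_side * grid_side * exposure_ms


def max_frame_rate(grid_side: int, exposure_ms: float) -> float:
    """Upper bound on frames per second under the no-overhead assumption."""
    return 1000.0 / frame_time_ms(grid_side, exposure_ms)


def upsampling_factor(in_side: int, out_side: int) -> float:
    """Ratio of output pixels to measured scan points."""
    return (out_side * out_side) / (in_side * in_side)


if __name__ == "__main__":
    print(frame_time_ms(16, 0.4))     # scan time per frame in ms
    print(max_frame_rate(16, 0.4))    # frames per second, upper bound
    print(upsampling_factor(16, 64))  # output pixels per measured point
```

Under this assumption a 16 x 16 scan at 0.4 ms per point takes about 102.4 ms, which is consistent with the stated rate of roughly 10 frames per second, and each measured point corresponds to 16 output pixels at 64 x 64 resolution.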

Similar Items