Staff View: ColorMamba: Towards High-quality NIR-to-RGB Spectral Translation with Mamba

Bibliographic Details
Main Authors:	Zhai, Huiyu; Jin, Guang; Yang, Xingxing; Kang, Guosheng
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.08087

_version_	1866916358797656064
author	Zhai, Huiyu; Jin, Guang; Yang, Xingxing; Kang, Guosheng
author_facet	Zhai, Huiyu; Jin, Guang; Yang, Xingxing; Kang, Guosheng
contents	Translating NIR to the visible spectrum is challenging due to cross-domain complexities. Current models struggle to balance a broad receptive field with computational efficiency, limiting practical use. Although the Selective Structured State Space Model, especially the improved version, Mamba, excels in generative tasks by capturing long-range dependencies with linear complexity, its default approach of converting 2D images into 1D sequences neglects local context. In this work, we propose a simple but effective backbone, dubbed ColorMamba, which first introduces Mamba into spectral translation tasks. To explore global long-range dependencies and local context for efficient spectral translation, we introduce learnable padding tokens to enhance the distinction of image boundaries and prevent potential confusion within the sequence model. Furthermore, local convolutional enhancement and agent attention are designed to improve the vanilla Mamba. Moreover, we exploit the HSV color to provide multi-scale guidance in the reconstruction process for more accurate spectral translation. Extensive experiments show that our ColorMamba achieves a 1.02 improvement in terms of PSNR compared with the state-of-the-art method. Our code is available at https://github.com/AlexYangxx/ColorMamba.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_08087
institution	arXiv
publishDate	2024
record_format	arxiv
title	ColorMamba: Towards High-quality NIR-to-RGB Spectral Translation with Mamba
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2408.08087
