Bibliographic Details
Main Authors: Dong, Chengqi; Cao, Zhiyuan; Qi, Tuoshi; Wu, Kexin; Gao, Yixing; Tang, Fan
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2602.07428
_version_ 1866912887311695872
author Dong, Chengqi
Cao, Zhiyuan
Qi, Tuoshi
Wu, Kexin
Gao, Yixing
Tang, Fan
contents The U-Net structure is widely used for low-light image/video enhancement. Without proper guidance from global information, however, the enhanced images can contain regions with heavy local noise and lose detail. Attention mechanisms can better focus on and use global information, but applying attention to images can significantly increase the number of parameters and the computational cost. We propose a Row-Column Separated Attention (RCSA) module inserted after an improved U-Net. The RCSA module takes as input the mean and maximum of each row and column of the feature map, using global information to guide local information with fewer parameters. We propose two temporal loss functions to apply the method to low-light video enhancement and maintain temporal consistency. Extensive experiments on the LOL and MIT Adobe FiveK image datasets and the SDSD video dataset demonstrate the effectiveness of our approach. The code is publicly available at https://github.com/cq-dong/URCSA.
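The abstract describes the RCSA input as the mean and maximum of the rows and columns of a feature map. The following is a minimal NumPy sketch of that pooling step only, assuming a feature map of shape (C, H, W). The function names `row_col_stats` and `toy_row_col_gate` are hypothetical, and the sigmoid gating is an illustrative stand-in, not the authors' module; the actual implementation is in the repository linked above.

```python
import numpy as np

def row_col_stats(feat):
    """Pool a (C, H, W) feature map into row and column descriptors.

    Returns row_desc of shape (2, C, H) and col_desc of shape (2, C, W),
    where index 0 holds the mean and index 1 holds the maximum.
    """
    row_desc = np.stack([feat.mean(axis=2), feat.max(axis=2)], axis=0)
    col_desc = np.stack([feat.mean(axis=1), feat.max(axis=1)], axis=0)
    return row_desc, col_desc

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def toy_row_col_gate(feat):
    """Hypothetical gating (an assumption, not the paper's design):
    average the mean and max descriptors, squash them to (0, 1),
    and broadcast the row and column gates back over the feature map."""
    row_desc, col_desc = row_col_stats(feat)
    row_gate = sigmoid(row_desc.mean(axis=0))[:, :, None]  # (C, H, 1)
    col_gate = sigmoid(col_desc.mean(axis=0))[:, None, :]  # (C, 1, W)
    return feat * row_gate * col_gate

# Random stand-in for a U-Net output feature map.
feat = np.random.default_rng(0).normal(size=(4, 8, 6))
row_desc, col_desc = row_col_stats(feat)
out = toy_row_col_gate(feat)
print(row_desc.shape, col_desc.shape, out.shape)
# (2, 4, 8) (2, 4, 6) (4, 8, 6)
```

The pooled descriptors hold 2·C·(H + W) values rather than the C·H·W values of the full map, which illustrates why attention computed on row and column statistics can be cheaper than attention over every pixel.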
format Preprint
id arxiv_https___arxiv_org_abs_2602_07428
institution arXiv
publishDate 2026
record_format arxiv
title Row-Column Separated Attention Based Low-Light Image/Video Enhancement
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2602.07428