Staff View: InfScene-SR: Arbitrary-Size Image Super-Resolution via Iterative Joint-Denoising :: Library Catalog

Bibliographic Details
Main Authors:	Sun, Shoukun; Wang, Zhe; Que, Xiang; Zhang, Jiyin; Ma, Xiaogang
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.19736

_version_	1866910045754621952
author	Sun, Shoukun; Wang, Zhe; Que, Xiang; Zhang, Jiyin; Ma, Xiaogang
author_facet	Sun, Shoukun; Wang, Zhe; Que, Xiang; Zhang, Jiyin; Ma, Xiaogang
contents	While diffusion models have achieved state-of-the-art performance in Image Super-Resolution (SR), their prohibitive computational and memory demands restrict their training and inference to fixed-size inputs. The standard workaround to super-resolve larger images relies on partitioning the image, super-resolving patches independently, and stitching them together -- a process that inevitably introduces severe boundary artifacts and spatial inconsistencies in large-scale scenes. To achieve spatially continuous, arbitrary-size image super-resolution, we propose InfScene-SR, a diffusion-based SR approach. Building upon SR3, our approach leverages Variance-Corrected Fusion (VCF) to perform joint-denoising across overlapping patches. VCF guarantees continuous transitions while preserving the stochastic variance crucial for high-fidelity texture reconstruction. To overcome the prohibitive synchronization overhead of scaling joint-denoising to gigapixel imagery, we introduce Spatially-Decoupled Variance Correction (SDVC). SDVC reformulates the global fusion process into independent, atomic patch operations, drastically reducing memory complexity to $\mathcal{O}(1)$ and naturally enabling fully distributed, parallelized inference. Extensive experiments on large-scale remote sensing datasets demonstrate that InfScene-SR strictly eliminates boundary seams, achieves superior perceptual quality, and significantly boosts performance on a downstream semantic segmentation task.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_19736
institution	arXiv
publishDate	2026
record_format	arxiv
title	InfScene-SR: Arbitrary-Size Image Super-Resolution via Iterative Joint-Denoising
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2602.19736
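
Note on the contents field: the abstract says that fusing overlapping patches must preserve stochastic variance. This record does not give the paper's VCF formula, so the sketch below is only a generic illustration of why a correction may be needed, not the authors' method. It assumes each overlapping patch contributes independent, unit-variance Gaussian noise at a shared pixel; the names fuse_noise and empirical_variance are hypothetical.

```python
import math
import random
import statistics


def fuse_noise(samples, weights, variance_corrected=True):
    """Blend independent noise samples that cover the same pixel.

    Assumption (not from the record): samples are independent with unit
    variance. A plain weighted average then has variance sum(w_i^2) <= 1,
    so the corrected variant divides by sqrt(sum(w_i^2)) to restore
    unit variance.
    """
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise weights to sum to 1
    blended = sum(wi * si for wi, si in zip(w, samples))
    if not variance_corrected:
        return blended
    return blended / math.sqrt(sum(wi * wi for wi in w))


def empirical_variance(weights, variance_corrected, trials=20000, seed=0):
    """Estimate the variance of the fused noise by Monte Carlo sampling."""
    rng = random.Random(seed)
    fused = [
        fuse_noise(
            [rng.gauss(0.0, 1.0) for _ in weights],
            weights,
            variance_corrected,
        )
        for _ in range(trials)
    ]
    return statistics.pvariance(fused)


if __name__ == "__main__":
    weights = [0.5, 0.5]  # two patches overlapping with equal weight
    print("plain average:", round(empirical_variance(weights, False), 3))
    print("variance-corrected:", round(empirical_variance(weights, True), 3))
```

With two equally weighted patches, the plain average has an expected variance of 0.5, while the rescaled version has an expected variance of 1. Under these assumptions, that loss of variance in overlap regions is the kind of effect a variance-preserving fusion has to avoid.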
