Staff View: Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yongxu; Quan, Yinghui; Xiao, Guoyao; Li, Aobo; Wu, Jinjian
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Multimedia
Online Access:	https://arxiv.org/abs/2401.02614

_version_	1866913186628763648
author	Liu, Yongxu; Quan, Yinghui; Xiao, Guoyao; Li, Aobo; Wu, Jinjian
author_facet	Liu, Yongxu; Quan, Yinghui; Xiao, Guoyao; Li, Aobo; Wu, Jinjian
contents	Quality assessment of images and videos depends on both local details and global semantics, whereas general data sampling methods (e.g., resizing, cropping, or grid-based fragments) fail to capture both simultaneously. To address this deficiency, current approaches adopt multi-branch models that take multi-resolution data as input, which increases model complexity. In this work, instead of stacking up models, a more elegant data sampling method (named SAMA, scaling and masking) is explored, which compacts both local and global content into a regular input size. The basic idea is to first scale the data into a pyramid and then reduce the pyramid to a regular data dimension with a masking strategy. Benefiting from the spatial and temporal redundancy in images and videos, the processed data maintains its multi-scale characteristics at a regular input size and can thus be processed by a single-branch model. We verify the sampling method on image and video quality assessment. Experiments show that our sampling method significantly improves the performance of current single-branch models and achieves performance competitive with multi-branch models without extra model complexity. The source code will be available at https://github.com/Sissuire/SAMA.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_02614
institution	arXiv
publishDate	2024
record_format	arxiv
title	Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment
topic	Computer Vision and Pattern Recognition; Multimedia
url	https://arxiv.org/abs/2401.02614
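
The abstract in the contents field describes the sampling idea only in outline: scale the data into a pyramid, then mask it down to a regular input size. The following is a minimal sketch of that idea for a single grayscale image, not the authors' implementation (see the repository linked in the abstract for that). The function names build_pyramid and scale_and_mask, the 2x2 average pooling, the random per-cell scale assignment, and the parameters num_levels, grid, patch, and seed are all assumptions made for illustration.

```python
import numpy as np


def build_pyramid(image, num_levels):
    """Return [full-res, 1/2, 1/4, ...] using 2x2 average pooling (an assumed choice)."""
    levels = [np.asarray(image, dtype=np.float64)]
    for _ in range(num_levels - 1):
        prev = levels[-1]
        # Crop to even dimensions so the 2x2 blocks tile exactly.
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2
        pooled = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(pooled)
    return levels


def scale_and_mask(image, num_levels=3, grid=4, patch=8, seed=0):
    """Fill a (grid*patch) x (grid*patch) output, one patch per grid cell.

    Each cell keeps a patch from exactly one pyramid level; the other
    levels are masked out for that cell. Fine levels contribute local
    detail, coarse levels contribute wider context.
    """
    levels = build_pyramid(image, num_levels)
    if min(levels[-1].shape) < patch:
        raise ValueError("coarsest pyramid level is smaller than one patch")

    rng = np.random.default_rng(seed)
    # Hypothetical masking rule: pick the surviving level per cell at random.
    level_map = rng.integers(0, num_levels, size=(grid, grid))

    out = np.empty((grid * patch, grid * patch), dtype=np.float64)
    for i in range(grid):
        for j in range(grid):
            lvl = levels[level_map[i, j]]
            h, w = lvl.shape
            # Take the patch at the cell's proportional position in this level.
            top = min(i * h // grid, h - patch)
            left = min(j * w // grid, w - patch)
            out[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = (
                lvl[top:top + patch, left:left + patch]
            )
    return out, level_map


image = np.arange(128 * 128, dtype=np.float64).reshape(128, 128)
sampled, level_map = scale_and_mask(image)
print(sampled.shape)
```

Whatever the source resolution, the output has a fixed size of grid * patch pixels per side, so in this sketch a single-branch model would see content from several scales in one regular input.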

Similar Items