Staff View: Back to Fundamentals: Low-Level Visual Features Guided Progressive Token Pruning :: Library Catalog

Bibliographic Details
Main Authors:	Ouyang, Yuanbing; Liang, Yizhuo; Li, Qingpeng; Guo, Xinfei; Luo, Yiming; Wu, Di; Wang, Hao; Pan, Yushan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.17996

_version_	1866912346388037632
author	Ouyang, Yuanbing; Liang, Yizhuo; Li, Qingpeng; Guo, Xinfei; Luo, Yiming; Wu, Di; Wang, Hao; Pan, Yushan
author_facet	Ouyang, Yuanbing; Liang, Yizhuo; Li, Qingpeng; Guo, Xinfei; Luo, Yiming; Wu, Di; Wang, Hao; Pan, Yushan
contents	Vision Transformers (ViTs) excel in semantic segmentation but demand significant computation, posing challenges for deployment on resource-constrained devices. Existing token pruning methods often overlook fundamental visual data characteristics. This study introduces 'LVTP', a progressive token pruning framework guided by multi-scale Tsallis entropy and low-level visual features with twice clustering. It integrates high-level semantics and basic visual attributes for precise segmentation. A novel dynamic scoring mechanism using multi-scale Tsallis entropy weighting overcomes limitations of traditional single-parameter entropy. The framework also incorporates low-level feature analysis to preserve critical edge information while optimizing computational cost. As a plug-and-play module, it requires no architectural changes or additional training. Evaluations across multiple datasets show 20%-45% computational reductions with negligible performance loss, outperforming existing methods in balancing cost and accuracy, especially in complex edge regions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_17996
institution	arXiv
publishDate	2025
record_format	arxiv
title	Back to Fundamentals: Low-Level Visual Features Guided Progressive Token Pruning
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2504.17996
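
The abstract in the contents field mentions token scoring with multi-scale Tsallis entropy but gives no formula. The sketch below is an illustration only, not the authors' implementation. It uses the standard Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy as q approaches 1. Everything else is an assumption made for the example: the function names, the set of q values, the uniform weights, the `keep_ratio` parameter, and the rule of keeping the highest-entropy tokens.

```python
import math
from typing import Optional, Sequence


def tsallis_entropy(probs: Sequence[float], q: float) -> float:
    """Standard Tsallis entropy of a discrete distribution.

    S_q(p) = (1 - sum_i p_i**q) / (q - 1); the q -> 1 limit is Shannon entropy.
    """
    if abs(q - 1.0) < 1e-9:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def multi_scale_score(
    probs: Sequence[float],
    qs: Sequence[float] = (0.5, 1.0, 2.0),        # assumed scales, not from the paper
    weights: Optional[Sequence[float]] = None,    # assumed uniform by default
) -> float:
    """Hypothetical multi-scale score: weighted sum of S_q over several q."""
    if weights is None:
        weights = [1.0 / len(qs)] * len(qs)
    return sum(w * tsallis_entropy(probs, q) for w, q in zip(weights, qs))


def prune_tokens(
    token_probs: Sequence[Sequence[float]], keep_ratio: float = 0.7
) -> list:
    """Return sorted indices of the tokens to keep.

    Assumption: tokens with higher entropy (less certain class prediction)
    are kept, and the most confident tokens are pruned.
    """
    scores = [multi_scale_score(p) for p in token_probs]
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    # Three tokens with per-class probabilities: uncertain, certain, fairly certain.
    tokens = [[0.5, 0.5], [1.0, 0.0], [0.9, 0.1]]
    print(prune_tokens(tokens, keep_ratio=0.67))
```

Here each token is represented by a class-probability vector. A real pipeline would likely derive these from intermediate ViT predictions or attention maps, which the record does not specify.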
