Staff View: ASAP: Attention Sink Anchored Pruning :: Library Catalog

Bibliographic Details
Main Authors:	Lee, Jaehyuk; Kim, Hanyoung; Kim, Yanggee; Lee, Donghun
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.22372

_version_	1866911704610242560
author	Lee, Jaehyuk; Kim, Hanyoung; Kim, Yanggee; Lee, Donghun
author_facet	Lee, Jaehyuk; Kim, Hanyoung; Kim, Yanggee; Lee, Donghun
contents	Vision Transformers (ViTs) face severe computational bottlenecks due to the quadratic complexity of self-attention at high resolutions. Existing token reduction methods rely on local metrics - such as single-layer attention scores - that are inherently vulnerable to the attention sink phenomenon, where uninformative tokens are paradoxically preserved over salient foreground objects. We propose ASAP (Attention Sink Anchored Pruning), a training-free framework that recasts this sink as a feature. Modeling ViT information flow as a Lazy Random Walk, ASAP identifies the sink as a dominant accumulator of probability mass. By computing the diffusion distance to the sink within the cumulative transition matrix, ASAP partitions tokens via Radial Diffusion Clustering and compresses background redundancy through Transition Weight Pooling in a single shot. Extensive experiments across image, video, and vision-language tasks demonstrate ASAP outperforms state-of-the-art methods, accelerating throughput by up to 48% while maintaining - or even exceeding - baseline accuracy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_22372
institution	arXiv
publishDate	2026
record_format	arxiv
title	ASAP: Attention Sink Anchored Pruning
topic	Machine Learning
url	https://arxiv.org/abs/2605.22372
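
The pipeline summarized in the contents field (lazy random walk, cumulative transition matrix, sink detection, diffusion distance to the sink, radial partitioning) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the record gives no formulas, so the laziness factor `alpha`, the unweighted Euclidean form of the diffusion distance, the equal-size radial bands, and every function name below are assumptions made for illustration only. The Transition Weight Pooling step is omitted because the abstract does not describe it in enough detail.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lazy_step(attn, alpha=0.5):
    """Assumed lazy walk: stay put with probability alpha, else follow attention."""
    n = len(attn)
    return [[alpha * (1.0 if i == j else 0.0) + (1.0 - alpha) * attn[i][j]
             for j in range(n)] for i in range(n)]

def cumulative_transition(attn_layers, alpha=0.5):
    """Compose the per-layer lazy steps into one cumulative transition matrix."""
    n = len(attn_layers[0])
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in attn_layers:
        total = matmul(lazy_step(attn, alpha), total)
    return total

def find_sink(total):
    """Assumed sink criterion: the token (column) that accumulates the most mass."""
    n = len(total)
    mass = [sum(total[i][j] for i in range(n)) for j in range(n)]
    return max(range(n), key=mass.__getitem__)

def distance_to_sink(total, sink):
    """Assumed diffusion distance: Euclidean gap between transition rows."""
    return [math.sqrt(sum((x - y) ** 2 for x, y in zip(row, total[sink])))
            for row in total]

def radial_bands(dist, num_bands):
    """Assumed radial partition: equal-size bands ordered by distance to the sink."""
    order = sorted(range(len(dist)), key=dist.__getitem__)
    bands = [[] for _ in range(num_bands)]
    for rank, tok in enumerate(order):
        bands[rank * num_bands // len(order)].append(tok)
    return bands

# Toy input: 4 tokens, 2 layers, with token 0 attracting most of the attention.
attn = [[0.7, 0.1, 0.1, 0.1],
        [0.6, 0.2, 0.1, 0.1],
        [0.6, 0.1, 0.2, 0.1],
        [0.6, 0.1, 0.1, 0.2]]
total = cumulative_transition([attn, attn])
sink = find_sink(total)
bands = radial_bands(distance_to_sink(total, sink), num_bands=2)
```

In this toy setup each attention matrix is row-stochastic, so the cumulative matrix stays row-stochastic and its column sums can be read as accumulated probability mass. Which bands count as background, and how their tokens are merged, is left open here.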

Similar Items