MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Azarkhalili, Behrooz; Libbrecht, Maxwell
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.15765

_version_	1866917933707427840
author	Azarkhalili, Behrooz; Libbrecht, Maxwell
author_facet	Azarkhalili, Behrooz; Libbrecht, Maxwell
contents	This paper introduces Generalized Attention Flow (GAF), a novel feature attribution method for Transformer-based models to address the limitations of current approaches. By extending Attention Flow and replacing attention weights with the generalized Information Tensor, GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions. The proposed method exhibits key theoretical properties and mitigates the shortcomings of prior techniques that rely solely on simple aggregation of attention weights. Our comprehensive benchmarking on sequence classification tasks demonstrates that a specific variant of GAF consistently outperforms state-of-the-art feature attribution methods in most evaluation settings, providing a more reliable interpretation of Transformer model outputs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_15765
institution	arXiv
publishDate	2025
record_format	arxiv
title	Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2502.15765
