Staff View: Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures :: Library Catalog

Bibliographic Details
Main Authors:	Lucas, Evan; Kangas, Dylan; Havens, Timothy C
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2410.08971

_version_	1866913542039404544
author	Lucas, Evan; Kangas, Dylan; Havens, Timothy C
author_facet	Lucas, Evan; Kangas, Dylan; Havens, Timothy C
contents	In this paper, we propose an extension to Longformer Encoder-Decoder, a popular sparse transformer architecture. One common challenge with sparse transformers is that they can struggle to encode long-range context, such as connections between topics discussed at the beginning and end of a document. A method to selectively increase global attention is proposed and demonstrated for abstractive summarization tasks on several benchmark data sets. By prefixing the transcript with additional keywords and encoding global attention on these keywords, improvement in zero-shot, few-shot, and fine-tuned cases is demonstrated for some benchmark data sets.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_08971
institution	arXiv
publishDate	2024
record_format	arxiv
title	Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures
topic	Computation and Language
url	https://arxiv.org/abs/2410.08971

Similar Items