Bibliographic Details
Main Authors: Yang, Haiqi; Li, Zhiyuan; Chang, Yi; Wu, Yuan
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2506.06708
author Yang, Haiqi
Li, Zhiyuan
Chang, Yi
Wu, Yuan
contents Retentive Network (RetNet) represents a significant advancement in neural network architecture, offering an efficient alternative to the Transformer. While Transformers rely on self-attention to model dependencies, they suffer from high memory costs and limited scalability when handling long sequences due to their quadratic complexity. To mitigate these limitations, RetNet introduces a retention mechanism that unifies the inductive bias of recurrence with the global dependency modeling of attention. This mechanism enables linear-time inference, facilitates efficient modeling of extended contexts, and remains compatible with fully parallelizable training pipelines. RetNet has garnered significant research interest due to its consistently demonstrated cross-domain effectiveness, achieving robust performance across machine learning paradigms including natural language processing, speech recognition, and time-series analysis. However, a comprehensive review of RetNet is still missing from the current literature. This paper aims to fill that gap by offering the first detailed survey of the RetNet architecture, its key innovations, and its diverse applications. We also explore the main challenges associated with RetNet and propose future research directions to support its continued advancement in both academic research and practical deployment.
format Preprint
id arxiv_https___arxiv_org_abs_2506_06708
institution arXiv
publishDate 2025
record_format arxiv
title A Survey of Retentive Network
topic Computation and Language
url https://arxiv.org/abs/2506.06708
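The abstract in this record says that the retention mechanism combines recurrence with attention-style dependency modeling, so that training can run in parallel while inference runs step by step. The record itself gives no formulas, so the following is only a minimal sketch under stated assumptions: it uses the single-head formulation commonly given for RetNet (a causal, exponentially decayed score matrix in the parallel form, and a decayed running state in the recurrent form) and omits the position rotation, multi-scale decay, normalization, and gating of the full model. The function names, the decay value `gamma`, and the random inputs are illustrative, not taken from the survey.

```python
import random


def retention_parallel(Q, K, V, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n-m) * (Q[n] . K[m]) * V[m]."""
    n_tokens, d_v = len(Q), len(V[0])
    out = []
    for n in range(n_tokens):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal mask: only positions m <= n contribute
            score = sum(q * k for q, k in zip(Q[n], K[m])) * gamma ** (n - m)
            for j in range(d_v):
                row[j] += score * V[m][j]
        out.append(row)
    return out


def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + K_n^T V_n, then out[n] = Q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # fixed-size state, independent of sequence length
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of uniform values in [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
Q, K, V = (random_matrix(6, 4, rng) for _ in range(3))
par = retention_parallel(Q, K, V, gamma=0.9)
rec = retention_recurrent(Q, K, V, gamma=0.9)

# Largest element-wise gap between the two forms; it should be at floating-point noise level.
max_diff = max(abs(a - b) for ra, rb in zip(par, rec) for a, b in zip(ra, rb))
print(max_diff)
```

In this sketch the parallel form touches every earlier position for each token, which mirrors a masked matrix product that can be batched during training. The recurrent form keeps only the `d_k` by `d_v` state `S`, so the cost per generated token does not grow with context length.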