Bibliographic Details
Main Authors: Wang, Xiao; Wang, Shiao; Ding, Yuhe; Li, Yuehang; Wu, Wentao; Rong, Yao; Kong, Weizhe; Huang, Ju; Li, Shihao; Yang, Haoxiang; Wang, Ziwen; Jiang, Bo; Li, Chenglong; Wang, Yaowei; Tian, Yonghong; Tang, Jin
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2404.09516
_version_ 1866914754427092992
author Wang, Xiao
Wang, Shiao
Ding, Yuhe
Li, Yuehang
Wu, Wentao
Rong, Yao
Kong, Weizhe
Huang, Ju
Li, Shihao
Yang, Haoxiang
Wang, Ziwen
Jiang, Bo
Li, Chenglong
Wang, Yaowei
Tian, Yonghong
Tang, Jin
contents In the post-deep learning era, the Transformer architecture has demonstrated powerful performance across large pre-trained models and various downstream tasks. However, the enormous computational demands of this architecture have deterred many researchers. To reduce the complexity of attention models, numerous efforts have been made to design more efficient methods. Among them, the State Space Model (SSM), a possible replacement for the self-attention-based Transformer, has drawn increasing attention in recent years. In this paper, we give the first comprehensive review of these works and provide experimental comparisons and analysis to better demonstrate the features and advantages of SSMs. Specifically, we first describe the underlying principles in detail to help readers quickly grasp the key ideas of SSMs. We then review existing SSMs and their applications, including natural language processing, computer vision, graphs, multi-modal and multimedia data, point clouds/event streams, time series data, and other domains. In addition, we give statistical comparisons and analysis of these models, which we hope will help readers understand the effectiveness of different structures on various tasks. Finally, we propose possible research directions to promote the development of both the theory and the applications of SSMs. More related works will be continuously updated on the following GitHub: https://github.com/Event-AHU/Mamba_State_Space_Model_Paper_List.
format Preprint
id arxiv_https___arxiv_org_abs_2404_09516
institution arXiv
publishDate 2024
record_format arxiv
title State Space Model for New-Generation Network Alternative to Transformers: A Survey
topic Machine Learning
Artificial Intelligence
Computation and Language
Computer Vision and Pattern Recognition
Multimedia
url https://arxiv.org/abs/2404.09516