Bibliographic Details
Main Author: Liang, Dizhen
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2410.23749
author Liang, Dizhen
author_facet Liang, Dizhen
contents Transformer-based architectures have achieved remarkable success in natural language processing and computer vision. However, their performance in multivariate long-term forecasting often falls short compared to simpler linear baselines. Previous research has identified the traditional attention mechanism as a key factor limiting their effectiveness in this domain. To bridge this gap, we introduce LATST, a novel approach designed to mitigate entropy collapse and training instability, common challenges in Transformer-based time series forecasting. We rigorously evaluate LATST across multiple real-world multivariate time series datasets, demonstrating its ability to outperform existing state-of-the-art Transformer models. Notably, LATST achieves competitive performance with fewer parameters than some linear models on certain datasets, highlighting its efficiency and effectiveness.
format Preprint
id arxiv_https___arxiv_org_abs_2410_23749
institution arXiv
publishDate 2024
record_format arxiv
title LATST: Are Transformers Necessarily Complex for Time-Series Forecasting
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2410.23749