Bibliographic Details
Main Authors:	Ghaderi, Seyed Himan; Azad, Saeed Sarbazi; Jaziriyan, Mohammad Mehdi; Akbari, Ahmad
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2602.20892
_version_	1866914347262935040
author	Ghaderi, Seyed Himan; Azad, Saeed Sarbazi; Jaziriyan, Mohammad Mehdi; Akbari, Ahmad
author_facet	Ghaderi, Seyed Himan; Azad, Saeed Sarbazi; Jaziriyan, Mohammad Mehdi; Akbari, Ahmad
contents	Today, social networks such as Twitter are the most widely used platforms for communication. Analyzing this data yields useful information for recognizing the opinions people express in tweets. Sentiment analysis, which identifies individuals' opinions about a specific topic, plays a vital role in NLP. Natural language processing in Persian still faces many challenges despite the advent of strong language models. The datasets available in Persian generally cover specific topics such as products, foods, and hotels, while users on social media may use irony and colloquial phrases. To overcome these challenges, a Persian sentiment analysis dataset built from Twitter is needed. In this paper, we introduce the Exa Persian sentiment analysis dataset, which is collected from Persian tweets. This dataset contains 12,000 tweets annotated by 5 native Persian taggers. The data is labeled in 3 classes: positive, neutral, and negative. We present the characteristics and statistics of this dataset and use the pre-trained ParsBERT and RoBERTa as base models to evaluate it. Our evaluation reached a macro F-score of 79.87, which shows that the model and data can be adequately valuable for a sentiment analysis system.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_20892
institution	arXiv
publishDate	2026
record_format	arxiv
title	Exa-PSD: a new Persian sentiment analysis dataset on Twitter
topic	Computation and Language
url	https://arxiv.org/abs/2602.20892

Similar Items