Bibliographic Details
Main Authors: Tran, Hong-Viet, Bui, Van-Tan, Tran, Lam-Quan
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2501.08758
_version_ 1866916567358373888
author Tran, Hong-Viet
Bui, Van-Tan
Tran, Lam-Quan
contents Sentiment analysis is a central task in Natural Language Processing (NLP): machine learning models are trained to classify text by the polarity of the opinions it expresses. Pre-trained Language Models (PLMs) can be adapted to downstream tasks through fine-tuning, which removes the need to train a model from scratch, and they have been applied to sentiment analysis in this way. Among the many models proposed for the task, the pre-trained PhoBERT-V2 models stand out as the state-of-the-art language models for Vietnamese. PhoBERT-V2 pre-training follows RoBERTa, which optimizes the BERT pre-training procedure for more robust performance. In this paper, we introduce a novel approach that combines PhoBERT-V2 and SentiWordNet for sentiment analysis of Vietnamese reviews. The proposed model uses PhoBERT-V2 as its Vietnamese language model and leverages SentiWordNet, a lexical resource explicitly designed to support sentiment classification applications. Experimental results on the VLSP 2016 and AIVIVN 2019 datasets show that the system performs strongly in comparison to other models.
format Preprint
id arxiv_https___arxiv_org_abs_2501_08758
institution arXiv
publishDate 2025
record_format arxiv
title Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models
topic Computation and Language
url https://arxiv.org/abs/2501.08758