Bibliographic Details
Main Authors:	Tan, Chee Heng; Zheng, Huiying; Wang, Jing; Lin, Zhuoyi; Feng, Shaodi; Zhan, Huijing; Li, Xiaoli; Senthilnath, J.
Format:	Preprint
Published:	2025
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2512.12978

_version_	1866911319059333120
author	Tan, Chee Heng; Zheng, Huiying; Wang, Jing; Lin, Zhuoyi; Feng, Shaodi; Zhan, Huijing; Li, Xiaoli; Senthilnath, J.
author_facet	Tan, Chee Heng; Zheng, Huiying; Wang, Jing; Lin, Zhuoyi; Feng, Shaodi; Zhan, Huijing; Li, Xiaoli; Senthilnath, J.
contents	With the advent of large language models (LLMs), the landscape of recommender systems is undergoing a significant transformation. Traditionally, user reviews have served as a critical source of rich, contextual information for enhancing recommendation quality. However, as LLMs demonstrate an unprecedented ability to understand and generate human-like text, this raises the question of whether explicit user reviews remain essential in the era of LLMs. In this paper, we provide a systematic investigation of the evolving role of text reviews in recommendation by comparing deep learning methods and LLM approaches. Particularly, we conduct extensive experiments on eight public datasets with LLMs and evaluate their performance in zero-shot, few-shot, and fine-tuning scenarios. We further introduce a benchmarking evaluation framework for review-aware recommender systems, RAREval, to comprehensively assess the contribution of textual reviews to the recommendation performance of review-aware recommender systems. Our framework examines various scenarios, including the removal of some or all textual reviews, random distortion, as well as recommendation performance in data sparsity and cold-start user settings. Our findings demonstrate that LLMs are capable of functioning as effective review-aware recommendation engines, generally outperforming traditional deep learning approaches, particularly in scenarios characterized by data sparsity and cold-start conditions. In addition, the removal of some or all textual reviews and random distortion does not necessarily lead to declines in recommendation accuracy. These findings motivate a rethinking of how user preference from text reviews can be more effectively leveraged. All code and supplementary materials are available at: https://github.com/zhytk/RAREval-data-processing.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_12978
institution	arXiv
publishDate	2025
record_format	arxiv
title	Do Reviews Matter for Recommendations in the Era of Large Language Models?
topic	Information Retrieval
url	https://arxiv.org/abs/2512.12978

Similar Items