_version_ 1866929525285191680
author Du, Jiangshu
Wang, Yibo
Zhao, Wenting
Deng, Zhongfen
Liu, Shuaiqi
Lou, Renze
Zou, Henry Peng
Venkit, Pranav Narayanan
Zhang, Nan
Srinath, Mukund
Zhang, Haoran Ranran
Gupta, Vipul
Li, Yinghui
Li, Tao
Wang, Fei
Liu, Qin
Liu, Tianlin
Gao, Pengzhi
Xia, Congying
Xing, Chen
Cheng, Jiayang
Wang, Zhaowei
Su, Ying
Shah, Raj Sanjay
Guo, Ruohao
Gu, Jing
Li, Haoran
Wei, Kangda
Wang, Zihao
Cheng, Lu
Ranathunga, Surangika
Fang, Meng
Fu, Jie
Liu, Fei
Huang, Ruihong
Blanco, Eduardo
Cao, Yixin
Zhang, Rui
Yu, Philip S.
Yin, Wenpeng
contents This work is motivated by two key trends. On the one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is both time-consuming and highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs assist researchers in alleviating their heavy workload? This study focuses on how LLMs can assist NLP researchers, particularly examining the effectiveness of LLMs in assisting paper (meta-)reviewing and the recognizability of LLM-generated reviews. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready versions) with both human-written and LLM-generated reviews, and (ii) expert-annotated "deficiency" labels and corresponding explanations for individual segments of each review. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers": how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers": how effectively can LLMs identify potential issues, such as deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.
format Preprint
id arxiv_https___arxiv_org_abs_2406_16253
institution arXiv
publishDate 2024
record_format arxiv
title LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing
topic Computation and Language
url https://arxiv.org/abs/2406.16253