Bibliographic Details
Main Authors: Tyser, Keith; Segev, Ben; Longhitano, Gaston; Zhang, Xin-Yu; Meeks, Zachary; Lee, Jason; Garg, Uday; Belsten, Nicholas; Shporer, Avi; Udell, Madeleine; Te'eni, Dov; Drori, Iddo
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2408.10365
contents Automatic reviewing helps handle a large volume of papers, provides early feedback and quality control, reduces bias, and allows the analysis of trends. We evaluate the alignment of automatic paper reviews with human reviews using an arena of human preferences by pairwise comparisons. Gathering human preference may be time-consuming; therefore, we also use an LLM to automatically evaluate reviews to increase sample efficiency while reducing bias. In addition to evaluating human and LLM preferences among LLM reviews, we fine-tune an LLM to predict human preferences, predicting which reviews humans will prefer in a head-to-head battle between LLMs. We artificially introduce errors into papers and analyze the LLM's responses to identify limitations, use adaptive review questions, meta prompting, role-playing, integrate visual and textual analysis, use venue-specific reviewing materials, and predict human preferences, improving upon the limitations of the traditional review processes. We make the reviews of publicly available arXiv and open-access Nature journal papers available online, along with a free service which helps authors review and revise their research papers and improve their quality. This work develops proof-of-concept LLM reviewing systems that quickly deliver consistent, high-quality reviews and evaluate their quality. We mitigate the risks of misuse, inflated review scores, overconfident ratings, and skewed score distributions by augmenting the LLM with multiple documents, including the review form, reviewer guide, code of ethics and conduct, area chair guidelines, and previous year statistics, by finding which errors and shortcomings of the paper may be detected by automated reviews, and evaluating pairwise reviewer preferences. This work identifies and addresses the limitations of using LLMs as reviewers and evaluators and enhances the quality of the reviewing process.
format Preprint
id arxiv_https___arxiv_org_abs_2408_10365
institution arXiv
publishDate 2024
record_format arxiv
title AI-Driven Review Systems: Evaluating LLMs in Scalable and Bias-Aware Academic Reviews
topic Artificial Intelligence
url https://arxiv.org/abs/2408.10365