

Bibliographic Details
Main author:	Wen, Yingxuan
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online access:	https://arxiv.org/abs/2604.14162

Summary:

Authors often struggle to interpret peer review feedback, deriving false hope from polite comments or feeling confused by specific low scores. To investigate this, we construct a dataset of over 30,000 ICLR 2021-2025 submissions and compare acceptance prediction performance using numerical scores versus text reviews. Our experiments reveal a significant performance gap: score-based models achieve 91% accuracy, while text-based models reach only 81% even with large language models, indicating that textual information is considerably less reliable. To explain this phenomenon, we first analyze the 9% of samples that score-based models fail to predict, finding their score distributions exhibit high kurtosis and negative skewness, which suggests that individual low scores play a decisive role in rejection even when the average score falls near the borderline. We then examine why text-based accuracy significantly lags behind scores from a review sentiment perspective, revealing the prevalence of the Politeness Principle: reviews of rejected papers still contain more positive than negative sentiment words, masking the true rejection signal and making it difficult for authors to judge outcomes from text alone.
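As a rough illustration of the skewness and kurtosis statistics mentioned in the summary, the sketch below computes them for one paper's review scores. It is not the authors' code: the function name `score_shape` and the example scores are hypothetical, and it assumes the statistics are taken over a single submission's scores using population (biased) moments, which the summary does not specify.

```python
from statistics import fmean


def score_shape(scores):
    """Return (mean, skewness, excess kurtosis) of a list of review scores.

    Uses population (biased) central moments; this is an assumption,
    since the summary does not say which estimator was used.
    """
    n = len(scores)
    mean = fmean(scores)
    # Central moments of order 2, 3 and 4.
    m2 = sum((s - mean) ** 2 for s in scores) / n
    m3 = sum((s - mean) ** 3 for s in scores) / n
    m4 = sum((s - mean) ** 4 for s in scores) / n
    if m2 == 0:
        raise ValueError("all scores are identical; shape is undefined")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, skewness, excess_kurtosis


# Hypothetical submission: four mildly positive scores and one low outlier.
mean, skew, kurt = score_shape([6, 6, 6, 6, 3])
print(f"mean={mean:.2f} skewness={skew:.2f} excess_kurtosis={kurt:.2f}")
```

In this made-up case the mean sits in the middle of the scale while the skewness is negative, which is the pattern the summary associates with a single low score pulling a borderline paper toward rejection.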
