Bibliographic Details
Main Authors: Chuang, Marianne; Chuang, Gabriel; Chuang, Cheryl; Chuang, John
Format: Preprint
Published: 2025
Subjects: Computation and Language; Applications
Online Access: https://arxiv.org/abs/2502.15094
author Chuang, Marianne
Chuang, Gabriel
Chuang, Cheryl
Chuang, John
contents We study the use of large language models (LLMs) to both evaluate and greenwash corporate climate disclosures. First, we investigate the use of the LLM-as-a-Judge (LLMJ) methodology for scoring company-submitted reports on emissions reduction targets and progress. Second, we probe the behavior of an LLM when it is prompted to greenwash a response subject to accuracy and length constraints. Finally, we test the robustness of the LLMJ methodology against responses that may be greenwashed using an LLM. We find that two LLMJ scoring systems, numerical rating and pairwise comparison, are effective in distinguishing high-performing companies from others, with the pairwise comparison system showing greater robustness against LLM-greenwashed responses.
format Preprint
id arxiv_https___arxiv_org_abs_2502_15094
institution arXiv
publishDate 2025
record_format arxiv
title Judging It, Washing It: Scoring and Greenwashing Corporate Climate Disclosures using Large Language Models
topic Computation and Language
Applications
url https://arxiv.org/abs/2502.15094