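Staff View: Toward Trustworthy Evaluation of Sustainability Rating Methodologies: A Human-AI Collaborative Framework for Benchmark Dataset Construction :: Library Catalog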

Bibliographic Details
Main Authors:	Cai, Xiaoran; Yang, Wang; Ren, Xiyu; Law, Chekun; Sharma, Rohit; Qi, Peng
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.17106

_version_	1866910026771202048
author	Cai, Xiaoran; Yang, Wang; Ren, Xiyu; Law, Chekun; Sharma, Rohit; Qi, Peng
author_facet	Cai, Xiaoran; Yang, Wang; Ren, Xiyu; Law, Chekun; Sharma, Rohit; Qi, Peng
contents	Sustainability or ESG rating agencies use company disclosures and external data to produce scores or ratings that assess the environmental, social, and governance performance of a company. However, sustainability ratings across agencies for a single company vary widely, limiting their comparability, credibility, and relevance to decision-making. To harmonize the rating results, we propose adopting a universal human-AI collaboration framework to generate trustworthy benchmark datasets for evaluating sustainability rating methodologies. The framework comprises two complementary parts: STRIDE (Sustainability Trust Rating & Integrity Data Equation) provides principled criteria and a scoring system that guide the construction of firm-level benchmark datasets using large language models (LLMs), and SR-Delta, a discrepancy-analysis procedural framework that surfaces insights for potential adjustments. The framework enables scalable and comparable assessment of sustainability rating methodologies. We call on the broader AI community to adopt AI-powered approaches to strengthen and advance sustainability rating methodologies that support and enforce urgent sustainability agendas.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_17106
institution	arXiv
publishDate	2026
record_format	arxiv
title	Toward Trustworthy Evaluation of Sustainability Rating Methodologies: A Human-AI Collaborative Framework for Benchmark Dataset Construction
topic	Artificial Intelligence
url	https://arxiv.org/abs/2602.17106

Similar Items