Bibliographic Details
Main Author: Dong, Stella C.
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence; Machine Learning; General Economics; Economics
Online Access: https://arxiv.org/abs/2511.08082
_version_ 1866917073091821568
author Dong, Stella C.
contents This paper develops a prudential framework for assessing the reliability of large language models (LLMs) in reinsurance. A five-pillar architecture (governance, data lineage, assurance, resilience, and regulatory alignment) translates supervisory expectations from Solvency II, SR 11-7, and guidance from EIOPA (2025), NAIC (2023), and IAIS (2024) into measurable lifecycle controls. The framework is implemented through the Reinsurance AI Reliability and Assurance Benchmark (RAIRAB), which evaluates whether governance-embedded LLMs meet prudential standards for grounding, transparency, and accountability. Across six task families, retrieval-grounded configurations achieved higher grounding accuracy (0.90), reduced hallucination and interpretive drift by roughly 40%, and nearly doubled transparency. These mechanisms lower informational frictions in risk transfer and capital allocation, showing that existing prudential doctrines already accommodate reliable AI when governance is explicit, data are traceable, and assurance is verifiable.
format Preprint
id arxiv_https___arxiv_org_abs_2511_08082
institution arXiv
publishDate 2025
record_format arxiv
title Prudential Reliability of Large Language Models in Reinsurance: Governance, Assurance, and Capital Efficiency
topic Artificial Intelligence
Machine Learning
General Economics
Economics
91B30, 62P05, 68T07
I.2.7; J.1; G.3
url https://arxiv.org/abs/2511.08082