Staff View: MTTR-A: Measuring Cognitive Recovery Latency in Multi-Agent Systems :: Library Catalog

Bibliographic Details
Main Author:	Or, Barak
Format:	Preprint
Published:	2025
Subjects:	Multiagent Systems; Artificial Intelligence; Systems and Control
Online Access:	https://arxiv.org/abs/2511.20663

_version_	1866915696030515200
author	Or, Barak
author_facet	Or, Barak
contents	Reliability in multi-agent systems (MAS) built on large language models is increasingly limited by cognitive failures rather than infrastructure faults. Existing observability tools describe failures but do not quantify how quickly distributed reasoning recovers once coherence is lost. We introduce MTTR-A (Mean Time-to-Recovery for Agentic Systems), a runtime reliability metric that measures cognitive recovery latency in MAS. MTTR-A adapts classical dependability theory to agentic orchestration, capturing the time required to detect reasoning drift and restore coherent operation. We further define complementary metrics, including MTBF and a normalized recovery ratio (NRR), and establish theoretical bounds linking recovery latency to long-run cognitive uptime. Using a LangGraph-based benchmark with simulated drift and reflex recovery, we empirically demonstrate measurable recovery behavior across multiple reflex strategies. This work establishes a quantitative foundation for runtime cognitive dependability in distributed agentic systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_20663
institution	arXiv
publishDate	2025
record_format	arxiv
title	MTTR-A: Measuring Cognitive Recovery Latency in Multi-Agent Systems
topic	Multiagent Systems; Artificial Intelligence; Systems and Control
url	https://arxiv.org/abs/2511.20663

Similar Items