Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Dehghani, Zeinab; Kureshi, Rameez Raja; Aslansefat, Koorosh; Abedi, Faezeh Alsadat; Thakker, Dhavalkumar; Greaves, Lisa; Mishra, Bhupesh Kumar; Ahmad, Baseer; Maslekar, Tanaya
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2603.23625

_version_	1866914420175667200
author	Dehghani, Zeinab; Kureshi, Rameez Raja; Aslansefat, Koorosh; Abedi, Faezeh Alsadat; Thakker, Dhavalkumar; Greaves, Lisa; Mishra, Bhupesh Kumar; Ahmad, Baseer; Maslekar, Tanaya
author_facet	Dehghani, Zeinab; Kureshi, Rameez Raja; Aslansefat, Koorosh; Abedi, Faezeh Alsadat; Thakker, Dhavalkumar; Greaves, Lisa; Mishra, Bhupesh Kumar; Ahmad, Baseer; Maslekar, Tanaya
contents	Artificial intelligence (AI) is increasingly being explored in health and social care to reduce administrative workload and allow staff to spend more time on patient care. This paper evaluates a voice-enabled Care Home Smart Speaker designed to support everyday activities in residential care homes, including spoken access to resident records, reminders, and scheduling tasks. A safety-focused evaluation framework is presented that examines the system end-to-end, combining Whisper-based speech recognition with retrieval-augmented generation (RAG) approaches (hybrid, sparse, and dense). Using supervised care-home trials and controlled testing, we evaluated 330 spoken transcripts across 11 care categories, including 184 reminder-containing interactions. These evaluations focus on (i) correct identification of residents and care categories, (ii) reminder recognition and extraction, and (iii) end-to-end scheduling correctness under uncertainty (including safe deferral/clarification). Given the safety-critical nature of care homes, particular attention is also paid to reliability in noisy environments and across diverse accents, supported by confidence scoring, clarification prompts, and human-in-the-loop oversight. In the best-performing configuration (GPT-5.2), resident ID and care category matching reached 100% (95% CI: 98.86-100), while reminder recognition reached 89.09% (95% CI: 83.81-92.80) with zero missed reminders (100% recall) but some false positives. End-to-end scheduling via calendar integration achieved 84.65% exact reminder-count agreement (95% CI: 78.00-89.56), indicating remaining edge cases in converting informal spoken instructions into actionable events. The findings suggest that voice-enabled systems, when carefully evaluated and appropriately safeguarded, can support accurate documentation, effective task management, and trustworthy use of AI in care home settings.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_23625
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework Dehghani, Zeinab Kureshi, Rameez Raja Aslansefat, Koorosh Abedi, Faezeh Alsadat Thakker, Dhavalkumar Greaves, Lisa Mishra, Bhupesh Kumar Ahmad, Baseer Maslekar, Tanaya Artificial Intelligence Computation and Language Artificial intelligence (AI) is increasingly being explored in health and social care to reduce administrative workload and allow staff to spend more time on patient care. This paper evaluates a voice-enabled Care Home Smart Speaker designed to support everyday activities in residential care homes, including spoken access to resident records, reminders, and scheduling tasks. A safety-focused evaluation framework is presented that examines the system end-to-end, combining Whisper-based speech recognition with retrieval-augmented generation (RAG) approaches (hybrid, sparse, and dense). Using supervised care-home trials and controlled testing, we evaluated 330 spoken transcripts across 11 care categories, including 184 reminder-containing interactions. These evaluations focus on (i) correct identification of residents and care categories, (ii) reminder recognition and extraction, and (iii) end-to-end scheduling correctness under uncertainty (including safe deferral/clarification). Given the safety-critical nature of care homes, particular attention is also paid to reliability in noisy environments and across diverse accents, supported by confidence scoring, clarification prompts, and human-in-the-loop oversight. In the best-performing configuration (GPT-5.2), resident ID and care category matching reached 100% (95% CI: 98.86-100), while reminder recognition reached 89.09% (95% CI: 83.81-92.80) with zero missed reminders (100% recall) but some false positives. End-to-end scheduling via calendar integration achieved 84.65% exact reminder-count agreement (95% CI: 78.00-89.56), indicating remaining edge cases in converting informal spoken instructions into actionable events. The findings suggest that voice-enabled systems, when carefully evaluated and appropriately safeguarded, can support accurate documentation, effective task management, and trustworthy use of AI in care home settings.
title	Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework
topic	Artificial Intelligence; Computation and Language
url	https://arxiv.org/abs/2603.23625

Similar Items