Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Furmakiewicz, Michal; Liu, Chang; Taylor, Angus; Venger, Ilya
Format:	Preprint
Published:	2024
Subjects:	Human-Computer Interaction; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.09512

_version_	1866929419478630400
author	Furmakiewicz, Michal; Liu, Chang; Taylor, Angus; Venger, Ilya
author_facet	Furmakiewicz, Michal; Liu, Chang; Taylor, Angus; Venger, Ilya
contents	Building a successful AI copilot requires a systematic approach. This paper is divided into two sections, covering the design and evaluation of a copilot respectively. A case study of developing copilot templates for the retail domain by Microsoft is used to illustrate the role and importance of each aspect. The first section explores the key technical components of a copilot's architecture, including the LLM, plugins for knowledge retrieval and actions, orchestration, system prompts, and responsible AI guardrails. The second section discusses testing and evaluation as a principled way to promote desired outcomes and manage unintended consequences when using AI in a business context. We discuss how to measure and improve its quality and safety, through the lens of an end-to-end human-AI decision loop framework. By providing insights into the anatomy of a copilot and the critical aspects of testing and evaluation, this paper provides concrete evidence of how good design and evaluation practices are essential for building effective, human-centered AI assistants.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_09512
institution	arXiv
publishDate	2024
record_format	arxiv
title	Design and evaluation of AI copilots -- case studies of retail copilot templates
topic	Human-Computer Interaction; Artificial Intelligence
url	https://arxiv.org/abs/2407.09512

Similar Items