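Staff View: COACH meets QUORUM: A Framework and Pipeline for Aligning User, Expert and Developer Perspectives in LLM-generated Health Counselling :: Library Catalog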

Bibliographic Details
Main Authors:	Ng, Yee Man; van Dijk, Bram; Beynen, Pieter; Boekesteijn, Otto; Jansen, Joris; van Oortmerssen, Gerard; van Duijn, Max; Spruit, Marco
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.08392

_version_	1866910046394253312
author	Ng, Yee Man; van Dijk, Bram; Beynen, Pieter; Boekesteijn, Otto; Jansen, Joris; van Oortmerssen, Gerard; van Duijn, Max; Spruit, Marco
author_facet	Ng, Yee Man; van Dijk, Bram; Beynen, Pieter; Boekesteijn, Otto; Jansen, Joris; van Oortmerssen, Gerard; van Duijn, Max; Spruit, Marco
contents	Systems that collect data on sleep, mood, and activities can provide valuable lifestyle counselling to populations affected by chronic disease and its consequences. Such systems are, however, challenging to develop; besides reliably extracting patterns from user-specific data, systems should also contextualise these patterns with validated medical knowledge to ensure the quality of counselling, and generate counselling that is relevant to a real user. We present QUORUM, a new evaluation framework that unifies these developer-, expert-, and user-centric perspectives, and show with a real case study that it meaningfully tracks convergence and divergence in stakeholder perspectives. We also present COACH, a Large Language Model-driven pipeline to generate personalised lifestyle counselling for our Healthy Chronos use case, a diary app for cancer patients and survivors. Applying our framework shows that overall, users, medical experts, and developers converge on the opinion that the generated counselling is relevant, of good quality, and reliable. However, stakeholders also diverge on the tone of the counselling, sensitivity to errors in pattern-extraction, and potential hallucinations. These findings highlight the importance of multi-stakeholder evaluation for consumer health language technologies and illustrate how a unified evaluation framework can support trustworthy, patient-centered NLP systems in real-world settings.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_08392
institution	arXiv
publishDate	2026
record_format	arxiv
title	COACH meets QUORUM: A Framework and Pipeline for Aligning User, Expert and Developer Perspectives in LLM-generated Health Counselling
topic	Computation and Language
url	https://arxiv.org/abs/2603.08392
