Bibliographic Details
Main Authors: Albrecht, Jens; Lehmann, Robert; Poltermann, Aleksandra; Rudolph, Eric; Steigerwald, Philipp; Stieler, Mara
Format: Preprint
Published: 2025
Subjects: Computation and Language; Machine Learning
Online Access: https://arxiv.org/abs/2512.09804
author Albrecht, Jens
Lehmann, Robert
Poltermann, Aleksandra
Rudolph, Eric
Steigerwald, Philipp
Stieler, Mara
contents This paper presents OnCoCo 1.0, a new public dataset for fine-grained message classification in online counseling. It is based on a new, integrative system of categories designed to improve the automated analysis of psychosocial online counseling conversations. Existing category systems, predominantly based on Motivational Interviewing (MI), are limited by their narrow focus and their dependence on datasets derived mainly from face-to-face counseling, which restricts the detailed examination of textual counseling conversations. In response, we developed a comprehensive new coding scheme that differentiates between 38 types of counselor utterances and 28 types of client utterances, and created a labeled dataset of about 2,800 messages from counseling conversations. We fine-tuned several models on our dataset to demonstrate its applicability. The data and models are publicly available to researchers and practitioners. Our work thus contributes a new type of fine-grained conversational resource to the language resources community, extending existing datasets for social and mental-health dialogue analysis.
format Preprint
id arxiv_https___arxiv_org_abs_2512_09804
institution arXiv
publishDate 2025
record_format arxiv
title OnCoCo 1.0: A Public Dataset for Fine-Grained Message Classification in Online Counseling Conversations
topic Computation and Language
Machine Learning
url https://arxiv.org/abs/2512.09804