Bibliographic Details
Main Authors: Wang, Jiaqing, Yang, Zhongfang, Zhu, Xingyuan, Huang, Zong'an, Wang, Hao, Tian, Li, Cao, Ying, Qu, Xiaomin, Qi, Xiang, Wu, Bei, Zhu, Zheng
Format: Preprint
Published: 2026
Subjects: Human-Computer Interaction; Artificial Intelligence
Online Access: https://arxiv.org/abs/2604.16343
author Wang, Jiaqing
Yang, Zhongfang
Zhu, Xingyuan
Huang, Zong'an
Wang, Hao
Tian, Li
Cao, Ying
Qu, Xiaomin
Qi, Xiang
Wu, Bei
Zhu, Zheng
contents Background: LLMs enable patient-facing conversational agents, creating a pathway toward digital twins that capture older adults' lived experiences and behavioral responses across time. A central barrier is personality drift -- inconsistent trait expression across repeated interactions -- which undermines reliability of generated trajectories and intervention-response simulation in geriatric care. Objective: To develop ELDER-SIM, a multi-role elderly-care conversational platform for building personality-stable digital twin agents, and to propose a psychometric validation framework for quantifying personality consistency in LLM-based agents. Methods: ELDER-SIM was implemented via n8n workflow orchestration with local LLM inference (Ollama/vLLM), integrating (1) Big Five (OCEAN) trait specifications, (2) a Cognitive Conceptualization Diagram (CCD) grounded in Beck's CBT framework, and (3) a MySQL-based long-term memory module. Ablation studies across four conditions -- Baseline, +Memory, +CCD, and +LoRA (fine-tuned on 19,717 instruction pairs from CHARLS) -- were evaluated via Cronbach's $\alpha$, ICC, and role discrimination accuracy. Results: Reliability was acceptable to excellent across conditions (Cronbach's $\alpha$: 0.70--0.94; ICC: 0.85--0.96). Role discrimination improved from 83.3% (Baseline) to 88.9% (+Memory), 94.4% (+CCD), and 97.2% (+LoRA). CCD produced the largest consistency gain (mean $\alpha$: $0.702 \to 0.892$), while LoRA achieved the highest overall consistency ($\alpha$ 0.940; ICC 0.958). Conclusions: ELDER-SIM provides a psychometrically validated approach for constructing personality-consistent elderly digital twin agents. Structured cognitive modeling and domain adaptation reduce personality drift, supporting reliable longitudinal simulation for elderly mental health care and reproducible in silico evaluation before clinical deployment.
format Preprint
id arxiv_https___arxiv_org_abs_2604_16343
institution arXiv
publishDate 2026
record_format arxiv
title Elder-Sim: A Psychometrically Validated Platform for Personality-Stable Elderly Digital Twins
topic Human-Computer Interaction
Artificial Intelligence
url https://arxiv.org/abs/2604.16343
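
The abstract in the contents field reports Cronbach's $\alpha$ as its main consistency statistic. As a minimal sketch of how that coefficient is conventionally computed, the function below applies the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$ to a small score matrix. The function name `cronbach_alpha`, the row/column layout, and the toy scores are assumptions made for illustration; this record does not state how the authors arranged agent runs and questionnaire items, so this is not their evaluation code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows; each row holds one respondent's (or, hypothetically,
    one agent run's) scores on the same k items. Layout is an assumption.
    """
    n_rows = len(scores)
    n_items = len(scores[0])
    if n_rows < 2 or n_items < 2:
        raise ValueError("need at least 2 rows and 2 items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("all rows must have the same number of items")

    # Sample variance of each item (column) across rows.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of the per-row total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total score has zero variance; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Toy data (made up): 4 rows scored on 3 items of a 1-5 scale.
toy_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
]
print(round(cronbach_alpha(toy_scores), 3))
```

When every item moves in lockstep across rows, the coefficient reaches 1; items that vary independently of one another pull it toward 0, which is why the statistic can serve as a rough index of consistent trait expression.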