Staff View: Distribution Shift Alignment Helps LLMs Simulate Survey Response Distributions :: Library Catalog

Bibliographic Details
Main Authors:	Huang, Ji; Li, Mengfei; Shao, Shuai
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.21977

_version_	1866911599701262336
author	Huang, Ji; Li, Mengfei; Shao, Shuai
author_facet	Huang, Ji; Li, Mengfei; Shao, Shuai
contents	Large language models (LLMs) offer a promising way to simulate human survey responses, potentially reducing the cost of large-scale data collection. However, existing zero-shot methods suffer from prompt sensitivity and low accuracy, while conventional fine-tuning approaches mostly fit the training set distributions and struggle to produce results more accurate than the training set itself, which deviates from the original goal of using LLMs to simulate survey responses. Building on this observation, we introduce Distribution Shift Alignment (DSA), a two-stage fine-tuning method that aligns both the output distributions and the distribution shifts across different backgrounds. By learning how these distributions change rather than fitting training data, DSA can provide results substantially closer to the true distribution than the training data. Empirically, DSA consistently outperforms other methods on five public survey datasets. We further conduct a comprehensive comparison covering accuracy, robustness, and data savings. DSA reduces the required real data by 53.48-69.12%, demonstrating its effectiveness and efficiency in survey simulation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_21977
institution	arXiv
publishDate	2025
record_format	arxiv
title	Distribution Shift Alignment Helps LLMs Simulate Survey Response Distributions
topic	Artificial Intelligence
url	https://arxiv.org/abs/2510.21977

Similar Items