Bibliographic Details
Main Authors: Qin, Zhenlin, Ling, Yancheng, Wang, Leizhen, Pereira, Francisco Câmara, Ma, Zhenliang
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.11569
author Qin, Zhenlin
Ling, Yancheng
Wang, Leizhen
Pereira, Francisco Câmara
Ma, Zhenliang
contents Population synthesis is essential for individual-level simulation in transport planning and socio-economic analysis, yet remains challenging due to the need to capture both statistical dependencies and high-level behavioral semantics. Existing data-driven approaches predominantly rely on unconditional generation, limiting their ability to support scenario-driven or target-oriented population synthesis. This study proposes SemaPop, a semantic-conditioned and controllable population synthesis framework that introduces persona representations as conditioning signals for generation. By deriving persona text from survey data using large language models (LLMs) and encoding it into semantic embeddings, SemaPop enables controllable population generation under statistical constraints. We instantiate the framework using a GAN-based architecture with marginal regularization to preserve distributional consistency. Extensive experiments demonstrate that SemaPop substantially improves generative performance, yielding closer alignment with target marginal and joint distributions while maintaining sample-level feasibility and diversity under semantic conditioning. Counterfactual analyses further demonstrate that semantic interventions induce systematic and interpretable shifts in generated populations. These results highlight the potential of persona-based semantic conditioning for controllable and scenario-oriented population synthesis.
format Preprint
id arxiv_https___arxiv_org_abs_2602_11569
institution arXiv
publishDate 2026
record_format arxiv
title SemaPop: Semantic-Persona Conditioned and Controllable Population Synthesis
topic Artificial Intelligence
url https://arxiv.org/abs/2602.11569