Bibliographic Details
Main Authors: Dirmeier, Simon; Hong, Ye; Perez-Cruz, Fernando
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2402.12242
_version_ 1866911779715547136
author Dirmeier, Simon
Hong, Ye
Perez-Cruz, Fernando
author_facet Dirmeier, Simon
Hong, Ye
Perez-Cruz, Fernando
contents Diffusion probabilistic models (DPMs) have rapidly become one of the predominant classes of generative models for simulating synthetic data, for instance in computer vision, audio, natural language processing, or biomolecule generation. Here, we propose using DPMs to generate synthetic individual location trajectories (ILTs), which are sequences of variables representing physical locations visited by individuals. ILTs are of major importance in mobility research for understanding the mobility behavior of populations and ultimately informing political decision-making. We represent ILTs as multi-dimensional categorical random variables and propose to model their joint distribution using a continuous DPM by first applying the diffusion process in a continuous unconstrained space and then mapping the continuous variables into a discrete space. We demonstrate that our model can synthesize realistic ILTs by comparing conditionally and unconditionally generated sequences to real-world ILTs from a GNSS tracking data set. This suggests the potential use of our model for synthetic data generation, for example for benchmarking models used in mobility research.
format Preprint
id arxiv_https___arxiv_org_abs_2402_12242
institution arXiv
publishDate 2024
record_format arxiv
title Synthetic location trajectory generation using categorical diffusion models
topic Machine Learning
url https://arxiv.org/abs/2402.12242