Bibliographic Details
Main Authors: Ma, Xinhang, Chen, Sirui
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2403.05586
_version_ 1866914708242563072
author Ma, Xinhang
Chen, Sirui
author_facet Ma, Xinhang
Chen, Sirui
contents Smart home voice assistants enable users to conveniently interact with IoT devices and perform Internet searches; however, they also collect voice input that can carry sensitive personal information about users. Previous papers investigated how information inferred from the contents of users' voice commands is shared or leaked for tracking and advertising purposes. In this paper, we systematically evaluate how voice itself is used for user profiling in the Google ecosystem. To do so, we simulate various user personas by engaging with specific categories of websites. We then interact with Google smart speakers using neutral voice commands, which we define as voice commands that neither reveal personal interests nor require the speakers to use the search APIs. We also explore the effects of non-neutral voice commands on user profiling. Notably, we employ voices that typically would not match the predefined personas. We then iteratively improve our experiments based on observations of profile changes to better simulate real-world user interactions with smart speakers. We find that Google uses these voice recordings for user profiling, and in some cases, up to 5 out of the 8 categories reported by Google for customizing advertisements are altered following the collection of the voice commands.
format Preprint
id arxiv_https___arxiv_org_abs_2403_05586
institution arXiv
publishDate 2024
record_format arxiv
title From Speech to Data: Unraveling Google's Use of Voice Data for User Profiling
topic Human-Computer Interaction
url https://arxiv.org/abs/2403.05586