Bibliographic Details
Main Authors: Park, Jin Hyun; Ayati, Seyyed Ali; Cai, Yichen
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2502.09782
author Park, Jin Hyun
Ayati, Seyyed Ali
Cai, Yichen
contents The increasing prevalence of microphones in everyday devices and the growing reliance on online services have amplified the risk of acoustic side-channel attacks (ASCAs) targeting keyboards. This study explores deep learning techniques, specifically vision transformers (VTs) and large language models (LLMs), to enhance the effectiveness and applicability of such attacks. We present substantial improvements over prior research, with the CoAtNet model achieving state-of-the-art performance. Our CoAtNet shows a 5.0% improvement for keystrokes recorded via smartphone (Phone) and 5.9% for those recorded via Zoom compared to previous benchmarks. We also evaluate transformer architectures and language models, with the best VT model matching CoAtNet's performance. A key advancement is the introduction of a noise mitigation method for real-world scenarios. By using LLMs for contextual understanding, we detect and correct erroneous keystrokes in noisy environments, enhancing ASCA performance. Additionally, fine-tuned lightweight language models with Low-Rank Adaptation (LoRA) deliver comparable performance to heavyweight models with 67X more parameters. This integration of VTs and LLMs improves the practical applicability of ASCA mitigation, marking the first use of these technologies to address ASCAs and error correction in real-world scenarios.
format Preprint
id arxiv_https___arxiv_org_abs_2502_09782
institution arXiv
publishDate 2025
record_format arxiv
title Improving Acoustic Side-Channel Attacks on Keyboards Using Transformers and Large Language Models
topic Machine Learning
Artificial Intelligence
Computation and Language
Audio and Speech Processing
url https://arxiv.org/abs/2502.09782