Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Karani, Jash; Chittem, Adithya; Roy, Deepan; Joshi, Sandeep
Format:	Preprint
Published:	2026
Subjects:	Sound; Machine Learning
Online Access:	https://arxiv.org/abs/2602.22431

_version_	1866914352127279104
author	Karani, Jash; Chittem, Adithya; Roy, Deepan; Joshi, Sandeep
author_facet	Karani, Jash; Chittem, Adithya; Roy, Deepan; Joshi, Sandeep
contents	Millimeter-wave (mmWave) radar captures are band-limited and noisy, which makes it difficult to reconstruct intelligible full-bandwidth speech. In this work, we propose a two-stage speech reconstruction pipeline for mmWave using a Radar-Aware Dual-conditioned Generative Adversarial Network (RAD-GAN), which is capable of performing bandwidth extension on signals with low signal-to-noise ratios (-5 dB to -1 dB) captured through glass walls. We propose an mmWave-tailored Multi-Mel Discriminator (MMD) and a Residual Fusion Gate (RFG) to enhance the generator input to process multiple conditioning channels. The proposed two-stage pipeline involves pretraining the model on synthetically clipped clean speech and fine-tuning on fused mel spectrograms generated by the RFG. We empirically show that the proposed method, trained on a limited dataset with no pre-trained modules and no data augmentations, outperforms state-of-the-art approaches for this specific task. Audio examples of RAD-GAN are available online at https://rad-gan-demo-site.vercel.app/.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_22431
institution	arXiv
publishDate	2026
record_format	arxiv
title	mmWave Radar Aware Dual-Conditioned GAN for Speech Reconstruction of Signals With Low SNR
topic	Sound; Machine Learning
url	https://arxiv.org/abs/2602.22431
