Bibliographic Details
Main Authors: Hu, Xuyi, Li, Jian, Picinali, Lorenzo, Hogg, Aidan O. T.
Format: Preprint
Published: 2025
Subjects: Sound, Machine Learning
Online Access: https://arxiv.org/abs/2504.17586
author Hu, Xuyi
Li, Jian
Picinali, Lorenzo
Hogg, Aidan O. T.
contents The demand for realistic virtual immersive audio continues to grow, with Head-Related Transfer Functions (HRTFs) playing a key role. HRTFs capture how sound reaches our ears, reflecting unique anatomical features and enhancing spatial perception. It has been shown that personalized HRTFs improve localization accuracy, but their measurement remains time-consuming and requires a noise-free environment. Although machine learning has been shown to reduce the required measurement points and, thus, the measurement time, a controlled environment is still necessary. This paper proposes a method to address this constraint by presenting a novel technique that can upsample sparse, noisy HRTF measurements. The proposed approach combines an HRTF Denoisy U-Net for denoising and an Autoencoding Generative Adversarial Network (AE-GAN) for upsampling from three measurement points. The proposed method achieves a log-spectral distortion (LSD) error of 5.41 dB and a cosine similarity loss of 0.0070, demonstrating the method's effectiveness in HRTF upsampling.
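The two error measures quoted in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact definitions, so the forms below are assumptions: log-spectral distortion is taken as the root-mean-square dB difference between two magnitude spectra over frequency bins, and the cosine similarity loss is taken as one minus the cosine similarity of two vectors. The function names are hypothetical and are not from the paper.

```python
import math


def log_spectral_distortion(h_true, h_est):
    """RMS difference in dB between two magnitude spectra.

    Assumed form: sqrt(mean_k (20*log10(|H_k| / |H_est_k|))^2).
    Both inputs are sequences of strictly positive magnitudes.
    """
    if len(h_true) != len(h_est) or len(h_true) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_db = [(20.0 * math.log10(a / b)) ** 2 for a, b in zip(h_true, h_est)]
    return math.sqrt(sum(squared_db) / len(squared_db))


def cosine_similarity_loss(u, v):
    """Assumed form: 1 - (u . v) / (||u|| * ||v||) for non-zero vectors."""
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy magnitudes, not data from the paper.
    reference = [1.0, 0.8, 0.5, 0.25]
    estimate = [0.9, 0.85, 0.45, 0.3]
    print(log_spectral_distortion(reference, estimate))
    print(cosine_similarity_loss(reference, estimate))
```

Under these assumed definitions, both measures are zero for a perfect reconstruction. A full evaluation would typically also average the LSD over all measured directions and both ears, which this sketch leaves out.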
format Preprint
id arxiv_https___arxiv_org_abs_2504_17586
institution arXiv
publishDate 2025
record_format arxiv
title A Machine Learning Approach for Denoising and Upsampling HRTFs
topic Sound
Machine Learning
url https://arxiv.org/abs/2504.17586