:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yunyi, Jin, Craig
Format:	Preprint
Published:	2024
Subjects:	Sound Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2412.18710

Similar Items

ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
by: Liu, Yunyi, et al.
Published: (2024)

Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment
by: Liu, Yunyi, et al.
Published: (2025)

Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech
by: Niu, Xinlei, et al.
Published: (2025)

The Neural-SRP method for positional sound source localization
by: Grinstein, Eric, et al.
Published: (2024)

Signal processing algorithm effective for sound quality of hearing loss simulators
by: Irino, Toshio, et al.
Published: (2024)

Serial-OE: Anomalous sound detection based on serial method with outlier exposure capable of using small amounts of anomalous data for training
by: Kuroyanagi, Ibuki, et al.
Published: (2025)

Adaptive high-precision sound source localization at low frequencies based on convolutional neural network
by: Ma, Wenbo, et al.
Published: (2024)

DNN-based ensemble singing voice synthesis with interactions between singers
by: Hyodo, Hiroaki, et al.
Published: (2024)

Efficient learning-based sound propagation for virtual and real-world audio processing applications
by: Ratnarajah, Anton Jeran
Published: (2024)

Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
by: Yue, Haobo, et al.
Published: (2024)

Differentiable physics for sound field reconstruction
by: Verburg, Samuel A., et al.
Published: (2025)

Stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling
by: Gao, Wenmiao, et al.
Published: (2025)

Frequency-aware convolution for sound event detection
by: Song, Tao, et al.
Published: (2024)

On the influence of language similarity in non-target speaker verification trials
by: Reuter, Paul M., et al.
Published: (2025)

Investigation of perceptual music similarity focusing on each instrumental part
by: Hashizume, Yuka, et al.
Published: (2025)

Graph-based multi-Feature fusion method for speech emotion recognition
by: Liu, Xueyu, et al.
Published: (2024)

EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling
by: Yin, Hao, et al.
Published: (2025)

Multizone sound field reproduction with direction-of-arrival-distribution-based regularization and its application to binaural-centered mode-matching
by: Matsuda, Ryo, et al.
Published: (2025)

STASE: A spatialized text-to-audio synthesis engine for music generation
by: Chi, Tutti, et al.
Published: (2025)

Some clues to build a sound analysis relevant to hearing
by: Millot, Laurent
Published: (2024)

Interaural time difference loss for binaural target sound extraction
by: Hernandez-Olivan, Carlos, et al.
Published: (2024)

Onset and offset weighted loss function for sound event detection
by: Song, Tao
Published: (2024)

Fine-tune the pretrained ATST model for sound event detection
by: Shao, Nian, et al.
Published: (2023)

A data-driven two-microphone method for in-situ sound absorption measurements
by: Emmerich, Leon, et al.
Published: (2025)

Representational learning for an anomalous sound detection system with source separation model
by: Shin, Seunghyeon, et al.
Published: (2024)

The role of direct sound spherical harmonics representation in externalization using binaural reproduction
by: Miller, Eran, et al.
Published: (2024)

Multispecies bird sound recognition using a fully convolutional neural network
by: García-Ordás, María Teresa, et al.
Published: (2024)

InsectSet459: an open dataset of insect sounds for bioacoustic machine learning
by: Faiß, Marius, et al.
Published: (2025)

In situ sound absorption estimation with the discrete complex image source method
by: Brandao, Eric, et al.
Published: (2024)

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder
by: Li, Xuyuan, et al.
Published: (2023)

Binaural sound source localization using a hybrid time and frequency domain model
by: Geva, Gil, et al.
Published: (2024)

DGSNA: Dynamic Generative Scene-based Noise Addition method
by: Chen, Zihao, et al.
Published: (2024)

Performance and energy balance: a comprehensive study of state-of-the-art sound event detection systems
by: Ronchini, Francesca, et al.
Published: (2023)

Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion
by: Nguyen, Phuc Duc, et al.
Published: (2023)

Resnet-conformer network with shared weights and attention mechanism for sound event localization, detection, and distance estimation
by: Vo, Quoc Thinh, et al.
Published: (2025)

SongTrans: An unified song transcription and alignment method for lyrics and notes
by: Wu, Siwei, et al.
Published: (2024)

EZhouNet: A framework based on graph neural network and anchor interval for the respiratory sound event detection
by: Chu, Yun, et al.
Published: (2025)

Language model integration based on memory control for sequence to sequence speech recognition
by: Cho, Jaejin, et al.
Published: (2018)

Time-domain sound field estimation using kernel ridge regression
by: Brunnström, Jesper, et al.
Published: (2025)

Communication conditions in virtual acoustic scenes in an underground station
by: Hládek, Ľuboš, et al.
Published: (2021)