:: Library Catalog

Bibliographic Details
Main Authors:	Deloche, François; Thienpont, Morgan; Verhulst, Sarah
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing; Biological Physics
Online Access:	https://arxiv.org/abs/2602.01758

Similar Items

Acoustic characterization of speech rhythm: going beyond metrics with recurrent neural networks
by: Deloche, François, et al.
Published: (2024)

dCoNNear: An Artifact-Free Neural Network Architecture for Closed-loop Audio Signal Processing
by: Wen, Chuan, et al.
Published: (2025)

Speaker-Independent Acoustic-to-Articulatory Inversion through Multi-Channel Attention Discriminator
by: Chung, Woo-Jin, et al.
Published: (2024)

A computational loudness model for electrical stimulation with cochlear implants
by: Alvarez, Franklin, et al.
Published: (2025)

TokenSE: a Mamba-based discrete token speech enhancement framework for cochlear implants
by: Chiang, Hsin-Tien, et al.
Published: (2026)

Enhancing spatial hearing with cochlear implants: exploring the role of AI, multimodal interaction and perceptual training
by: Picinali, Lorenzo, et al.
Published: (2026)

Binaural sound source localization using a hybrid time and frequency domain model
by: Geva, Gil, et al.
Published: (2024)

Rethinking the joint estimation of magnitude and phase for time-frequency domain neural vocoders
by: Dai, Lingling, et al.
Published: (2025)

Tandem spoofing-robust automatic speaker verification based on time-domain embeddings
by: Weizman, Avishai, et al.
Published: (2024)

Progressive unsupervised domain adaptation for ASR using ensemble models and multi-stage training
by: Ahmad, Rehan, et al.
Published: (2024)

Directional reflection modeling via wavenumber-domain reflection coefficient for 3D acoustic field simulation
by: Hoshika, Satoshi, et al.
Published: (2026)

End-to-end audio-visual learning for cochlear implant sound coding simulations in noisy environments
by: Lin, Meng-Ping, et al.
Published: (2025)

Deep low-latency joint speech transmission and enhancement over a gaussian channel
by: Bokaei, Mohammad, et al.
Published: (2024)

Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic
by: Kheir, Yassine El, et al.
Published: (2024)

Short-Segment Speaker Verification with Pre-trained Models and Multi-Resolution Encoder
by: Myoung, Jisoo, et al.
Published: (2025)

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update
by: Ahmad, Rehan, et al.
Published: (2026)

Analyzing long-term rhythm variations in Mising and Assamese using frequency domain correlates
by: Gogoi, Parismita, et al.
Published: (2024)

Adaptive Speaker Embedding Self-Augmentation for Personal Voice Activity Detection with Short Enrollment Speech
by: Feng, Fuyuan, et al.
Published: (2026)

Spatial Audio Signal Enhancement: A Multi-output MVDR Method in The Spherical Harmonic-domain
by: Zhang, Huawei, et al.
Published: (2024)

Benchmarking multi-component signal processing methods in the time-frequency plane
by: Miramont, Juan M., et al.
Published: (2024)

Cross-domain Neural Pitch and Periodicity Estimation
by: Morrison, Max, et al.
Published: (2023)

Real-time speech enhancement in noise for throat microphone using neural audio codec as foundation model
by: Hauret, Julien, et al.
Published: (2025)

Train Short, Infer Long: Speech-LLM Enables Zero-Shot Streamable Joint ASR and Diarization on Long Audio
by: Shi, Mohan, et al.
Published: (2025)

Improving Short Utterance Anti-Spoofing with AASIST2
by: Zhang, Yuxiang, et al.
Published: (2023)

Multi-Utterance Speech Separation and Association Trained on Short Segments
by: Wang, Yuzhu, et al.
Published: (2025)

Automotive sound field reproduction using deep optimization with spatial domain constraint
by: Qian, Yufan, et al.
Published: (2025)

WhisperFlow: speech foundation models in real time
by: Wang, Rongxiang, et al.
Published: (2024)

PlumberNet: Fixing interference leakage after GEV beamforming
by: Grondin, François, et al.
Published: (2023)

Why does music source separation benefit from cacophony?
by: Jeon, Chang-Bin, et al.
Published: (2024)

ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency
by: Chen, Yafeng, et al.
Published: (2024)

Acoustic source localization in the spherical harmonics domain exploiting low-rank approximations
by: Cobos, Maximo, et al.
Published: (2023)

Discrimination loss vs. SRT: A model-based approach towards harmonizing speech test interpretations
by: Buhl, Mareike, et al.
Published: (2025)

Passive acoustic non-line-of-sight localization without a relay surface
by: Sommer, Tal I., et al.
Published: (2025)

Modeling Intrapersonal and Interpersonal Influences for Automatic Estimation of Therapist Empathy in Counseling Conversation
by: Tao, Dehua, et al.
Published: (2023)

Learning Representation of Therapist Empathy in Counseling Conversation Using Siamese Hierarchical Attention Network
by: Tao, Dehua, et al.
Published: (2023)

The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement
by: Leglaive, Simon, et al.
Published: (2023)

Unsupervised Improved MVDR Beamforming for Sound Enhancement
by: Kealey, Jacob, et al.
Published: (2024)

Understanding the strengths and weaknesses of SSL models for audio deepfake model attribution
by: Pîrlogeanu, Gabriel, et al.
Published: (2026)

Modeling and Link Budget Feasibility Analysis of Secure LoRa-Based Peer-to-Peer Communication for Short-Range Tactical Networks
by: Agrawal, Ayush Kumar, et al.
Published: (2026)

Online incremental learning for audio classification using a pretrained audio model
by: Mulimani, Manjunath, et al.
Published: (2025)