| Main Authors: | Hauret, Julien, Olivier, Malo, Joubaud, Thomas, Langrenne, Christophe, Poirée, Sarah, Zimpfer, Véronique, Bavu, Éric |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.11828 |
Similar Items
French Listening Tests for the Assessment of Intelligibility, Quality, and Identity of Body-Conducted Speech Enhancement
by: Joubaud, Thomas, et al.
Published: (2025)
Configurable EBEN: Extreme Bandwidth Extension Network to enhance body-conducted speech capture
by: Hauret, Julien, et al.
Published: (2023)
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
by: Hauret, Julien, et al.
Published: (2022)
Real-time speech enhancement in noise for throat microphone using neural audio codec as foundation model
by: Hauret, Julien, et al.
Published: (2025)
Bringing Interpretability to Neural Audio Codecs
by: Sadok, Samir, et al.
Published: (2025)
spINAch: A Diachronic Corpus of French Broadcast Speech Controlled for Speakers' Age and Gender
by: Devauchelle, Simon, et al.
Published: (2026)
VoxEffects: A Speech-Oriented Audio Effects Dataset and Benchmark
by: Zhang, Zhe, et al.
Published: (2026)
YODAS: Youtube-Oriented Dataset for Audio and Speech
by: Li, Xinjian, et al.
Published: (2024)
Benchmarking Large Pretrained Multilingual Models on Québec French Speech Recognition
by: Serrand, Coralie, et al.
Published: (2025)
Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts
by: Pelloin, Valentin, et al.
Published: (2026)
The Sounds of Home: A Speech-Removed Residential Audio Dataset for Sound Event Detection
by: Bibbó, Gabriel, et al.
Published: (2024)
CUEMPATHY: A Counseling Speech Dataset for Psychotherapy Research
by: Tao, Dehua, et al.
Published: (2024)
CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)
ODAQ: Open Dataset of Audio Quality
by: Torcoli, Matteo, et al.
Published: (2023)
Audio-Visual Speech Enhancement for Spatial Audio - Spatial-VisualVoice and the MAVE Database
by: Yaffe, Danielle, et al.
Published: (2025)
WaLi: Can Pressure Sensors in HVAC Systems Capture Human Speech?
by: Tamiti, Tarikul Islam, et al.
Published: (2025)
Interpreting the Role of Visemes in Audio-Visual Speech Recognition
by: Papadopoulos, Aristeidis, et al.
Published: (2025)
A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition
by: de Groot, Dimme, et al.
Published: (2026)
Cross-linguistic Prosodic Analysis of Autistic and Non-autistic Child Speech in Finnish, French and Slovak
by: Myllylä, Ida-Lotta, et al.
Published: (2026)
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models
by: Bai, Jisheng, et al.
Published: (2024)
Speech Separation using Neural Audio Codecs with Embedding Loss
by: Yip, Jia Qi, et al.
Published: (2024)
ULTRAS -- Unified Learning of Transformer Representations for Audio and Speech Signals
by: E, Ameenudeen P, et al.
Published: (2026)
Classification of Autistic and Non-Autistic Children's Speech: A Cross-Linguistic Study in Finnish, French, and Slovak
by: Kakouros, Sofoklis, et al.
Published: (2026)
Expanding and Analyzing ODAQ -- the Open Dataset of Audio Quality
by: Dick, Sascha, et al.
Published: (2025)
Enhancing Crowdsourced Audio for Text-to-Speech Models
by: Giraldo, José, et al.
Published: (2024)
SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
by: Zhang, Xiangyu, et al.
Published: (2025)
FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
by: Kim, Jongsuk, et al.
Published: (2025)
Audio-Visual Feature Synchronization for Robust Speech Enhancement in Hearing Aids
by: Saleem, Nasir, et al.
Published: (2025)
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations
by: Yang, Xiaoyu, et al.
Published: (2025)
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
by: Li, Jiaqi, et al.
Published: (2024)
ASPED: An Audio Dataset for Detecting Pedestrians
by: Seshadri, Pavan, et al.
Published: (2023)
Uncovering the Visual Contribution in Audio-Visual Speech Recognition
by: Lin, Zhaofeng, et al.
Published: (2024)
Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation
by: Hsieh, Tsun-An, et al.
Published: (2024)
Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
by: Yang, Hsiang-Cheng, et al.
Published: (2026)
Cross-Modal Bottleneck Fusion For Noise Robust Audio-Visual Speech Recognition
by: Ok, Seaone, et al.
Published: (2026)
Rethinking Mamba in Speech Processing by Self-Supervised Models
by: Zhang, Xiangyu, et al.
Published: (2024)
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
by: Yang, Qian, et al.
Published: (2024)
SaSLaW: Dialogue Speech Corpus with Audio-visual Egocentric Information Toward Environment-adaptive Dialogue Speech Synthesis
by: Take, Osamu, et al.
Published: (2024)
A Generalist Audio Foundation Model for Comprehensive Body Sound Auscultation
by: Wang, Pingjie, et al.
Published: (2024)
LongCat-Audio-Codec: An Audio Tokenizer and Detokenizer Solution Designed for Speech Large Language Models
by: Zhao, Xiaohan, et al.
Published: (2025)