:: Library Catalog

Bibliographic Details
Main Authors:	Masiero, Bruno S.; Borges, Leticia R.; Dillon, Harvey; Colella-Santos, Maria Francisca
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2409.04014

Similar Items

TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data
by: Kim, Seung-Bin, et al.
Published: (2024)

Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
by: Santos, Arthur N. dos, et al.
Published: (2024)

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation
by: Matsuura, Kohei, et al.
Published: (2024)

Silent Speech Sentence Recognition with Six-Axis Accelerometers using Conformer and CTC Algorithm
by: Xie, Yudong, et al.
Published: (2025)

Exploring Sentence Type Effects on the Lombard Effect and Intelligibility Enhancement: A Comparative Study of Natural and Grid Sentences
by: Chen, Hongyang, et al.
Published: (2023)

EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful Sentences
by: Li, Baifeng, et al.
Published: (2023)

A Survey on 30+ Years of Automatic Singing Assessment and Singing Information Processing
by: Santos, Arthur N. dos, et al.
Published: (2026)

French Listening Tests for the Assessment of Intelligibility, Quality, and Identity of Body-Conducted Speech Enhancement
by: Joubaud, Thomas, et al.
Published: (2025)

WHISTRESS: Enriching Transcriptions with Sentence Stress Detection
by: Yosha, Iddo, et al.
Published: (2025)

Evaluation of an ITD-to-ILD Transformation as a Method to Restore the Spatial Benefit in Speech Intelligibility in Hearing Impaired Listeners
by: Bäumer, Timm-Jonas, et al.
Published: (2025)

Listen First, Then Answer: Timestamp-Grounded Speech Reasoning
by: Jeong, Jihoon, et al.
Published: (2026)

Evaluating Speech Enhancement Systems Through Listening Effort
by: Gelderblom, Femke B., et al.
Published: (2024)

Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
by: Yang, Hsiang-Cheng, et al.
Published: (2026)

SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
by: Cho, Cheol Jun, et al.
Published: (2023)

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)

Listen through the Sound: Generative Speech Restoration Leveraging Acoustic Context Representation
by: Chung, Soo-Whan, et al.
Published: (2025)

Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
by: Hu, Cheng-Hung, et al.
Published: (2025)

Evaluation of Google's Voice Recognition and Sentence Classification for Health Care Applications
by: Uddin, Majbah, et al.
Published: (2024)

Non-Intrusive Binaural Speech Intelligibility Prediction Using Mamba for Hearing-Impaired Listeners
by: Yamamoto, Katsuhiko, et al.
Published: (2025)

Listening Between the Lines: Synthetic Speech Detection Disregarding Verbal Content
by: Salvi, Davide, et al.
Published: (2024)

Extracting accent features in spoken Brazilian Portuguese without sociolinguistic labels
by: Leite, Pedro H. L., et al.
Published: (2026)

Investigation of Speech and Noise Latent Representations in Single-channel VAE-based Speech Enhancement
by: Li, Jiatong, et al.
Published: (2025)

Flexible Multichannel Speech Enhancement for Noise-Robust Frontend
by: Jukić, Ante, et al.
Published: (2024)

Noise-robust Speech Separation with Fast Generative Correction
by: Wang, Helin, et al.
Published: (2024)

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)

A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)

NAST: Noise Aware Speech Tokenization for Speech Language Models
by: Messica, Shoval, et al.
Published: (2024)

Leveraging Spatial Cues from Cochlear Implant Microphones to Efficiently Enhance Speech Separation in Real-World Listening Scenes
by: Olalere, Feyisayo, et al.
Published: (2025)

Can Speech LLMs Think while Listening?
by: Shih, Yi-Jen, et al.
Published: (2025)

Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping
by: Maghsoudi, Maryam, et al.
Published: (2026)

Hearing-Loss Compensation Using Deep Neural Networks: A Framework and Results From a Listening Test
by: Leer, Peter, et al.
Published: (2024)

Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
by: Rahimi, Akam, et al.
Published: (2025)

SpatialCodec: Neural Spatial Speech Coding
by: Xu, Zhongweiyang, et al.
Published: (2023)

CAMÕES: A Comprehensive Automatic Speech Recognition Benchmark for European Portuguese
by: Carvalho, Carlos, et al.
Published: (2025)

Listen, Think, and Understand
by: Gong, Yuan, et al.
Published: (2023)

Noise-Aware Speech Separation with Contrastive Learning
by: Zhang, Zizheng, et al.
Published: (2023)

Cochleagram-based Noise Adapted Speaker Identification System for Distorted Speech
by: Ahmed, Sabbir, et al.
Published: (2025)

Audio-Visual Speech Enhancement for Spatial Audio - Spatial-VisualVoice and the MAVE Database
by: Yaffe, Danielle, et al.
Published: (2025)

Past, Present, and Future of Spatial Audio and Room Acoustics
by: Koyama, Shoichi, et al.
Published: (2025)

A Pre-training Framework that Encodes Noise Information for Speech Quality Assessment
by: Sultana, Subrina, et al.
Published: (2024)