Library Catalog

Bibliographic Details
Main Author:	Bleeck, Stefan
Format:	Preprint
Published:	2025
Subjects:	Sound; Computation and Language; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2506.11620

Similar Items

Advancing Hearing Assessment: An ASR-Based Frequency-Specific Speech Test for Diagnosing Presbycusis
by: Bleeck, Stefan
Published: (2025)

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units
by: Poli, Maxime, et al.
Published: (2026)

Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis
by: Zhou, Kun, et al.
Published: (2024)

Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
by: Yusuyin, Saierdaer, et al.
Published: (2024)

PAST: Phonetic-Acoustic Speech Tokenizer
by: Har-Tuv, Nadav, et al.
Published: (2025)

StressTest: Can YOUR Speech LM Handle the Stress?
by: Yosha, Iddo, et al.
Published: (2025)

Phonetic Segmentation of the UCLA Phonetics Lab Archive
by: Chodroff, Eleanor, et al.
Published: (2024)

LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition
by: Yoon, Eunseop, et al.
Published: (2024)

Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition
by: Choi, Anna Seo Gyeong, et al.
Published: (2025)

Simultaneous Speech-to-Speech Translation Without Aligned Data
by: Labiausse, Tom, et al.
Published: (2026)

Methods to Increase the Amount of Data for Speech Recognition for Low Resource Languages
by: Ayrapetyan, Alexan, et al.
Published: (2025)

Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection
by: Lin, Hsi-Che, et al.
Published: (2024)

Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
by: Cornell, Samuele, et al.
Published: (2024)

Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)

The ART of Conversation: Measuring Phonetic Convergence and Deliberate Imitation in L2-Speech with a Siamese RNN
by: Yuan, Zheng, et al.
Published: (2023)

Self-Supervised Speech Models Encode Phonetic Context via Position-dependent Orthogonal Subspaces
by: Choi, Kwanghee, et al.
Published: (2026)

Acoustic to Articulatory Inversion of Speech; Data Driven Approaches, Challenges, Applications, and Future Scope
by: Pillai, Leena G, et al.
Published: (2025)

FASA: a Flexible and Automatic Speech Aligner for Extracting High-quality Aligned Children Speech Data
by: Liu, Dancheng, et al.
Published: (2024)

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs
by: Futami, Hayato, et al.
Published: (2025)

SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
by: Zhang, Xin, et al.
Published: (2023)

Interface Design for Self-Supervised Speech Models
by: Shih, Yi-Jen, et al.
Published: (2024)

Continuous Speech Tokenizer in Text To Speech
by: Li, Yixing, et al.
Published: (2024)

Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference
by: Casanova, Edresson, et al.
Published: (2024)

Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis
by: Do, Cong-Thanh, et al.
Published: (2024)

Direct Speech-to-Speech Neural Machine Translation: A Survey
by: Gupta, Mahendra, et al.
Published: (2024)

ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus
by: Ogunremi, Tolulope, et al.
Published: (2023)

Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond
by: Lee, Beomseok, et al.
Published: (2024)

High-Fidelity Simultaneous Speech-To-Speech Translation
by: Labiausse, Tom, et al.
Published: (2025)

Continual Speech Learning with Fused Speech Features
by: Wang, Guitao, et al.
Published: (2025)

SpeechTaxi: On Multilingual Semantic Speech Classification
by: Keller, Lennart, et al.
Published: (2024)

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
by: Deng, Keqi, et al.
Published: (2025)

Dynamic Data Pruning for Automatic Speech Recognition
by: Xiao, Qiao, et al.
Published: (2024)

Examining Test-Time Adaptation for Personalized Child Speech Recognition
by: Shi, Zhonghao, et al.
Published: (2024)

Rethinking Speech Representation Aggregation in Speech Enhancement: A Phonetic Mutual Information Perspective
by: Han, Seungu, et al.
Published: (2026)

Advancing African-Accented Speech Recognition: Epistemic Uncertainty-Driven Data Selection for Generalizable ASR Models
by: Dossou, Bonaventure F. P.
Published: (2023)

SpeechAlign: Aligning Speech Generation to Human Preferences
by: Zhang, Dong, et al.
Published: (2024)

MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
by: Peng, Yifan, et al.
Published: (2024)

DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)

SimClass: A Classroom Speech Dataset Generated via Game Engine Simulation For Automatic Speech Recognition Research
by: Attia, Ahmed Adel, et al.
Published: (2025)

Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)