:: Library Catalog

Bibliographic Details
Main Authors:	Lou, Haowei; Huang, Chengkai; Paik, Hye-young; Hu, Yongquan; Quigley, Aaron; Hu, Wen; Yao, Lina
Format:	Preprint
Published:	2025
Subjects:	Systems and Control; Sound
Online Access:	https://arxiv.org/abs/2510.20113

Similar Items

ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech
by: Lou, Haowei, et al.
Published: (2026)

Generalized Multilingual Text-to-Speech Generation with Language-Aware Style Adaptation
by: Lou, Haowei, et al.
Published: (2025)

ParaStyleTTS: Toward Efficient and Robust Paralinguistic Style Control for Expressive Text-to-Speech Generation
by: Lou, Haowei, et al.
Published: (2025)

StyleSpeech: Parameter-efficient Fine Tuning for Pre-trained Controllable Text-to-Speech
by: Lou, Haowei, et al.
Published: (2024)

Aligner-Guided Training Paradigm: Advancing Text-to-Speech Models with Aligner Guided Duration
by: Lou, Haowei, et al.
Published: (2024)

LatentSpeech: Latent Diffusion for Text-To-Speech Generation
by: Lou, Haowei, et al.
Published: (2024)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)

Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
by: Min, Anna, et al.
Published: (2025)

Speech-to-See: End-to-End Speech-Driven Open-Set Object Detection
by: Lu, Wenhuan, et al.
Published: (2025)

Representation Purification for End-to-End Speech Translation
by: Zhang, Chengwei, et al.
Published: (2024)

A Parallel Ultra-Low Power Silent Speech Interface based on a Wearable, Fully-dry EMG Neckband
by: Meier, Fiona, et al.
Published: (2025)

Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan
by: Wang, Jialing, et al.
Published: (2026)

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)

End-to-End Speech-to-Text Translation: A Survey
by: Sethiya, Nivedita, et al.
Published: (2023)

Frequency-Specific Neural Response and Cross-Correlation Analysis of Envelope Following Responses to Native Speech and Music Using Multichannel EEG Signals: A Case Study
by: Hasan, Md. Mahbub, et al.
Published: (2025)

CosyEdit: Unlocking End-to-End Speech Editing Capability from Zero-Shot Text-to-Speech Models
by: Chen, Junyang, et al.
Published: (2026)

Probing Human Articulatory Constraints in End-to-End TTS with Reverse and Mismatched Speech-Text Directions
by: Khadse, Parth, et al.
Published: (2026)

A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation
by: Ma, Zhengrui, et al.
Published: (2024)

ML-ARIS: Multilayer Underwater Acoustic Reconfigurable Intelligent Surface with High-Resolution Reflection Control
by: Pu, Lina, et al.
Published: (2025)

An End-to-End Speech Summarization Using Large Language Model
by: Shang, Hengchao, et al.
Published: (2024)

On Improving Error Resilience of Neural End-to-End Speech Coders
by: Gupta, Kishan, et al.
Published: (2024)

Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection
by: Zhou, Xuanru, et al.
Published: (2024)

WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification
by: Zhou, Junzuo, et al.
Published: (2024)

Central Kurdish Text-to-Speech Synthesis with Novel End-to-End Transformer Training
by: Ahmad, Hawraz A., et al.
Published: (2024)

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering
by: Hu, Jiliang, et al.
Published: (2025)

AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
by: Huang, Wuwei, et al.
Published: (2025)

Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment
by: Farhadipour, Aref, et al.
Published: (2023)

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
by: Li, Tianpeng, et al.
Published: (2025)

Meta-Learning in Audio and Speech Processing: An End to End Comprehensive Review
by: Raimon, Athul, et al.
Published: (2024)

TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
by: Bataev, Vladimir, et al.
Published: (2025)

TeraSim-World: Worldwide Safety-Critical Data Synthesis for End-to-End Autonomous Driving
by: Wang, Jiawei, et al.
Published: (2025)

FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis
by: Guo, Yinlin, et al.
Published: (2024)

ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer
by: Xie, Tong, et al.
Published: (2024)

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)

Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
by: Moslem, Yasmin
Published: (2024)

An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning
by: Guimarães, Heitor R., et al.
Published: (2024)

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
by: Cheng, Shanbo, et al.
Published: (2024)

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering
by: Lin, Chyi-Jiunn, et al.
Published: (2024)

Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
by: Agro, Maha Tufail, et al.
Published: (2025)