:: Library Catalog

Bibliographic Details
Main Author:	Kopparapu, Sunil Kumar
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Sound
Online Access:	https://arxiv.org/abs/2605.14427

Similar Items

A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)

Probing Human Articulatory Constraints in End-to-End TTS with Reverse and Mismatched Speech-Text Directions
by: Khadse, Parth, et al.
Published: (2026)

Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax
by: Patil, Aditya, et al.
Published: (2024)

Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
by: Jiang, Jintao, et al.
Published: (2024)

End-to-End Speech-to-Text Translation: A Survey
by: Sethiya, Nivedita, et al.
Published: (2023)

Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses
by: Li, Chia-Yu, et al.
Published: (2024)

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
by: Li, Tianpeng, et al.
Published: (2025)

End-to-end Joint Punctuated and Normalized ASR with a Limited Amount of Punctuated Training Data
by: Cui, Can, et al.
Published: (2023)

Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network
by: Dissen, Yehoshua, et al.
Published: (2024)

Representation Purification for End-to-End Speech Translation
by: Zhang, Chengwei, et al.
Published: (2024)

End-to-End Simultaneous Dysarthric Speech Reconstruction with Frame-Level Adaptor and Multiple Wait-k Knowledge Distillation
by: Wu, Minghui, et al.
Published: (2026)

Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
by: Munakata, Hokuto, et al.
Published: (2024)

Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue
by: Jia, Yuhang, et al.
Published: (2026)

A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation
by: Ma, Zhengrui, et al.
Published: (2024)

Joint Speech and Text Training for LLM-Based End-to-End Spoken Dialogue State Tracking
by: Vendrame, Katia, et al.
Published: (2025)

An End-to-End Speech Summarization Using Large Language Model
by: Shang, Hengchao, et al.
Published: (2024)

An investigation of phrase break prediction in an End-to-End TTS system
by: Vadapalli, Anandaswarup
Published: (2023)

Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models
by: Feng, Sheng, et al.
Published: (2024)

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)

Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction
by: Qian, Mengjie, et al.
Published: (2025)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)

Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
by: Agro, Maha Tufail, et al.
Published: (2025)

Acoustically Precise Hesitation Tagging Is Essential for End-to-End Verbatim Transcription Systems
by: Lin, Jhen-Ke, et al.
Published: (2025)

PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding
by: Le, Trang, et al.
Published: (2024)

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)

Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
by: Moslem, Yasmin
Published: (2024)

End-to-End Spoken Grammatical Error Correction
by: Qian, Mengjie, et al.
Published: (2025)

SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation
by: Lee, Sangmin, et al.
Published: (2025)

Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems
by: Arora, Siddhant, et al.
Published: (2025)

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot
by: Zeng, Aohan, et al.
Published: (2024)

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation
by: Wang, Peidong, et al.
Published: (2024)

VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
by: Cui, Wenqian, et al.
Published: (2025)

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview
by: Liu, Heyang, et al.
Published: (2024)

Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models
by: Perez, Matthew, et al.
Published: (2024)

Unifying EEG and Speech for Emotion Recognition: A Two-Step Joint Learning Framework for Handling Missing EEG Data During Inference
by: Tiwari, Upasana, et al.
Published: (2025)

Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

Retrieval Augmented End-to-End Spoken Dialog Models
by: Wang, Mingqiu, et al.
Published: (2024)

AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
by: Huang, Wuwei, et al.
Published: (2025)

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model
by: Huang, Ailin, et al.
Published: (2025)

Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios
by: Subramanian, Aswin Shanmugam, et al.
Published: (2025)