| Main Author: | Kopparapu, Sunil Kumar |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.14427 |
Similar Items
A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)
Probing Human Articulatory Constraints in End-to-End TTS with Reverse and Mismatched Speech-Text Directions
by: Khadse, Parth, et al.
Published: (2026)
Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax
by: Patil, Aditya, et al.
Published: (2024)
Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
by: Jiang, Jintao, et al.
Published: (2024)
End-to-End Speech-to-Text Translation: A Survey
by: Sethiya, Nivedita, et al.
Published: (2023)
Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses
by: Li, Chia-Yu, et al.
Published: (2024)
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
by: Li, Tianpeng, et al.
Published: (2025)
End-to-end Joint Punctuated and Normalized ASR with a Limited Amount of Punctuated Training Data
by: Cui, Can, et al.
Published: (2023)
Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network
by: Dissen, Yehoshua, et al.
Published: (2024)
Representation Purification for End-to-End Speech Translation
by: Zhang, Chengwei, et al.
Published: (2024)
End-to-End Simultaneous Dysarthric Speech Reconstruction with Frame-Level Adaptor and Multiple Wait-k Knowledge Distillation
by: Wu, Minghui, et al.
Published: (2026)
Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework
by: Munakata, Hokuto, et al.
Published: (2024)
Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue
by: Jia, Yuhang, et al.
Published: (2026)
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation
by: Ma, Zhengrui, et al.
Published: (2024)
Joint Speech and Text Training for LLM-Based End-to-End Spoken Dialogue State Tracking
by: Vendrame, Katia, et al.
Published: (2025)
An End-to-End Speech Summarization Using Large Language Model
by: Shang, Hengchao, et al.
Published: (2024)
An investigation of phrase break prediction in an End-to-End TTS system
by: Vadapalli, Anandaswarup
Published: (2023)
Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models
by: Feng, Sheng, et al.
Published: (2024)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)
Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction
by: Qian, Mengjie, et al.
Published: (2025)
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)
Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
by: Agro, Maha Tufail, et al.
Published: (2025)
Acoustically Precise Hesitation Tagging Is Essential for End-to-End Verbatim Transcription Systems
by: Lin, Jhen-Ke, et al.
Published: (2025)
PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding
by: Le, Trang, et al.
Published: (2024)
Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)
Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
by: Moslem, Yasmin
Published: (2024)
End-to-End Spoken Grammatical Error Correction
by: Qian, Mengjie, et al.
Published: (2025)
SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation
by: Lee, Sangmin, et al.
Published: (2025)
Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems
by: Arora, Siddhant, et al.
Published: (2025)
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot
by: Zeng, Aohan, et al.
Published: (2024)
Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation
by: Wang, Peidong, et al.
Published: (2024)
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
by: Cui, Wenqian, et al.
Published: (2025)
Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview
by: Liu, Heyang, et al.
Published: (2024)
Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models
by: Perez, Matthew, et al.
Published: (2024)
Unifying EEG and Speech for Emotion Recognition: A Two-Step Joint Learning Framework for Handling Missing EEG Data During Inference
by: Tiwari, Upasana, et al.
Published: (2025)
Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)
Retrieval Augmented End-to-End Spoken Dialog Models
by: Wang, Mingqiu, et al.
Published: (2024)
AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
by: Huang, Wuwei, et al.
Published: (2025)
Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model
by: Huang, Ailin, et al.
Published: (2025)
Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios
by: Subramanian, Aswin Shanmugam, et al.
Published: (2025)