| Main Authors: | Deng, Keqi, Woodland, Philip C. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.04541 |
Similar Items
Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition
by: Deng, Keqi, et al.
Published: (2023)
SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
by: Deng, Keqi, et al.
Published: (2025)
Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition
by: Deng, Keqi, et al.
Published: (2024)
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning
by: Deng, Keqi, et al.
Published: (2024)
HENT-SRT: Hierarchical Efficient Neural Transducer with Self-Distillation for Joint Speech Recognition and Translation
by: Hussein, Amir, et al.
Published: (2025)
High-Fidelity Simultaneous Speech-To-Speech Translation
by: Labiausse, Tom, et al.
Published: (2025)
TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
by: Bataev, Vladimir, et al.
Published: (2025)
SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation
by: Le, Chenyang, et al.
Published: (2025)
Transducer Consistency Regularization for Speech to Text Applications
by: Tseng, Cindy, et al.
Published: (2024)
Simultaneous Speech-to-Speech Translation Without Aligned Data
by: Labiausse, Tom, et al.
Published: (2026)
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
by: Sun, Guangzhi, et al.
Published: (2022)
Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
by: Wu, Wen, et al.
Published: (2023)
Distribution-based Emotion Recognition in Conversation
by: Wu, Wen, et al.
Published: (2022)
NAIST Simultaneous Speech Translation System for IWSLT 2024
by: Ko, Yuka, et al.
Published: (2024)
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)
End-to-End Speech Translation for Low-Resource Languages Using Weakly Labeled Data
by: Pothula, Aishwarya, et al.
Published: (2025)
SimulTron: On-Device Simultaneous Speech to Speech Translation
by: Agranovich, Alex, et al.
Published: (2024)
A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)
Direct Speech-to-Speech Neural Machine Translation: A Survey
by: Gupta, Mahendra, et al.
Published: (2024)
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
by: Zhang, Shaolei, et al.
Published: (2024)
Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice
by: Cheng, Shanbo, et al.
Published: (2025)
Self-Supervised Learning for Multi-Channel Neural Transducer
by: Kojima, Atsushi
Published: (2024)
Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation
by: Wang, Peidong, et al.
Published: (2025)
SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation
by: Djanibekov, Amirbek, et al.
Published: (2026)
CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
by: Zhang, Tian-Hao, et al.
Published: (2023)
Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
by: Cheng, Shanbo, et al.
Published: (2024)
Textless Speech-to-Speech Translation With Limited Parallel Data
by: Diwan, Anuj, et al.
Published: (2023)
TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR
by: Kumar, Shashi, et al.
Published: (2024)
Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)
Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition
by: Moritz, Niko, et al.
Published: (2024)
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
by: Kim, Minchan, et al.
Published: (2024)
REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation
by: Hirschkind, Nameer, et al.
Published: (2025)
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation
by: Papi, Sara, et al.
Published: (2024)
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
by: Xu, Hainan, et al.
Published: (2024)
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation
by: Ma, Zhengrui, et al.
Published: (2024)
Zero-resource Speech Translation and Recognition with LLMs
by: Mundnich, Karel, et al.
Published: (2024)
Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing
by: Choi, Jeongsoo, et al.
Published: (2025)
Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
by: Wang, Peng, et al.
Published: (2023)
Promptformer: Prompted Conformer Transducer for ASR
by: Duarte-Torres, Sergio, et al.
Published: (2024)