:: Library Catalog

Bibliographic Details
Main Authors:	Deng, Keqi; Woodland, Philip C.
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.04541

Similar Items

Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition
by: Deng, Keqi, et al.
Published: (2023)

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
by: Deng, Keqi, et al.
Published: (2025)

Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition
by: Deng, Keqi, et al.
Published: (2024)

Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning
by: Deng, Keqi, et al.
Published: (2024)

HENT-SRT: Hierarchical Efficient Neural Transducer with Self-Distillation for Joint Speech Recognition and Translation
by: Hussein, Amir, et al.
Published: (2025)

High-Fidelity Simultaneous Speech-To-Speech Translation
by: Labiausse, Tom, et al.
Published: (2025)

TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
by: Bataev, Vladimir, et al.
Published: (2025)

SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation
by: Le, Chenyang, et al.
Published: (2025)

Transducer Consistency Regularization for Speech to Text Applications
by: Tseng, Cindy, et al.
Published: (2024)

Simultaneous Speech-to-Speech Translation Without Aligned Data
by: Labiausse, Tom, et al.
Published: (2026)

Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator
by: Sun, Guangzhi, et al.
Published: (2022)

Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
by: Wu, Wen, et al.
Published: (2023)

Distribution-based Emotion Recognition in Conversation
by: Wu, Wen, et al.
Published: (2022)

NAIST Simultaneous Speech Translation System for IWSLT 2024
by: Ko, Yuka, et al.
Published: (2024)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)

End-to-End Speech Translation for Low-Resource Languages Using Weakly Labeled Data
by: Pothula, Aishwarya, et al.
Published: (2025)

SimulTron: On-Device Simultaneous Speech to Speech Translation
by: Agranovich, Alex, et al.
Published: (2024)

A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

Direct Speech-to-Speech Neural Machine Translation: A Survey
by: Gupta, Mahendra, et al.
Published: (2024)

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
by: Zhang, Shaolei, et al.
Published: (2024)

Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice
by: Cheng, Shanbo, et al.
Published: (2025)

Self-Supervised Learning for Multi-Channel Neural Transducer
by: Kojima, Atsushi
Published: (2024)

Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation
by: Wang, Peidong, et al.
Published: (2025)

SimulU: Training-free Policy for Long-form Simultaneous Speech-to-Speech Translation
by: Djanibekov, Amirbek, et al.
Published: (2026)

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
by: Zhang, Tian-Hao, et al.
Published: (2023)

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
by: Cheng, Shanbo, et al.
Published: (2024)

Textless Speech-to-Speech Translation With Limited Parallel Data
by: Diwan, Anuj, et al.
Published: (2023)

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR
by: Kumar, Shashi, et al.
Published: (2024)

Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition
by: Moritz, Niko, et al.
Published: (2024)

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
by: Kim, Minchan, et al.
Published: (2024)

REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation
by: Hirschkind, Nameer, et al.
Published: (2025)

SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation
by: Papi, Sara, et al.
Published: (2024)

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
by: Xu, Hainan, et al.
Published: (2024)

A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation
by: Ma, Zhengrui, et al.
Published: (2024)

Zero-resource Speech Translation and Recognition with LLMs
by: Mundnich, Karel, et al.
Published: (2024)

Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing
by: Choi, Jeongsoo, et al.
Published: (2025)

Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
by: Wang, Peng, et al.
Published: (2023)

Promptformer: Prompted Conformer Transducer for ASR
by: Duarte-Torres, Sergio, et al.
Published: (2024)