:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Tuochao; Wang, Qirui; He, Runlin; Gollakota, Shyam
Type:	Preprint
Published:	2025
Subjects:	Computation and Language; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2504.18715

Similar Items

Wireless Hearables With Programmable Speech AI Accelerators
by: Itani, Malek, et al.
Published: (2025)

Proactive Hearing Assistants that Isolate Egocentric Conversations
by: Hu, Guilin, et al.
Published: (2025)

TF-MLPNet: Tiny Real-Time Neural Speech Separation
by: Itani, Malek, et al.
Published: (2025)

LLAMAPIE: Proactive In-Ear Conversation Assistants
by: Chen, Tuochao, et al.
Published: (2025)

Look Once to Hear: Target Speech Hearing with Noisy Examples
by: Veluri, Bandhav, et al.
Published: (2024)

High-Fidelity Simultaneous Speech-To-Speech Translation
by: Labiausse, Tom, et al.
Published: (2025)

Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)

DiariST: Streaming Speech Translation with Speaker Diarization
by: Yang, Mu, et al.
Published: (2023)

Representation Purification for End-to-End Speech Translation
by: Zhang, Chengwei, et al.
Published: (2024)

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
by: Deng, Keqi, et al.
Published: (2025)

Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation
by: Liu, Henglyu, et al.
Published: (2025)

Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)

Simultaneous Speech-to-Speech Translation Without Aligned Data
by: Labiausse, Tom, et al.
Published: (2026)

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs
by: Futami, Hayato, et al.
Published: (2025)

Direct Speech-to-Speech Neural Machine Translation: A Survey
by: Gupta, Mahendra, et al.
Published: (2024)

Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation
by: Akarsh, Sai, et al.
Published: (2024)

A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

Dub-S2ST: Textless Speech-to-Speech Translation for Seamless Dubbing
by: Choi, Jeongsoo, et al.
Published: (2025)

Compact Speech Translation Models via Discrete Speech Units Pretraining
by: Lam, Tsz Kin, et al.
Published: (2024)

StreamUni: Achieving Streaming Speech Translation with a Unified Large Speech-Language Model
by: Guo, Shoutao, et al.
Published: (2025)

MTP-S2UT: Enhancing Speech-to-Speech Translation Quality with Multi-token Prediction
by: Wang, Jianjin, et al.
Published: (2025)

Lightweight Audio Segmentation for Long-form Speech Translation
by: Lee, Jaesong, et al.
Published: (2024)

End-to-End Speech-to-Text Translation: A Survey
by: Sethiya, Nivedita, et al.
Published: (2023)

NAIST Simultaneous Speech Translation System for IWSLT 2024
by: Ko, Yuka, et al.
Published: (2024)

Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
by: Hwang, Min-Jae, et al.
Published: (2024)

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation
by: Wang, Peidong, et al.
Published: (2024)

Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech
by: Wotherspoon, Shannon, et al.
Published: (2024)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
by: Huang, Wuwei, et al.
Published: (2025)

Efficient Speech Translation through Model Compression and Knowledge Distillation
by: Moslem, Yasmin
Published: (2025)

MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
by: Peng, Yifan, et al.
Published: (2024)

From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition
by: Wang, Tianduo, et al.
Published: (2025)

Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?
by: Tsiamas, Ioannis, et al.
Published: (2024)

MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation
by: Chen, Szu-Chi, et al.
Published: (2026)

Novel Parasitic Dual-Scale Modeling for Efficient and Accurate Multilingual Speech Translation
by: Le, Chenyang, et al.
Published: (2025)

SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation
by: Le, Chenyang, et al.
Published: (2025)

Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection
by: Lin, Hsi-Che, et al.
Published: (2024)

Bemba Speech Translation: Exploring a Low-Resource African Language
by: Farouq, Muhammad Hazim Al, et al.
Published: (2025)

Investigating Decoder-only Large Language Models for Speech-to-text Translation
by: Huang, Chao-Wei, et al.
Published: (2024)

Recent Advances in End-to-End Simultaneous Speech Translation
by: Liu, Xiaoqian, et al.
Published: (2024)

PHRASED: Phrase Dictionary Biasing for Speech Translation
by: Wang, Peidong, et al.
Published: (2025)