| Main Authors: | Li, Siqi, Liu, Danni, Niehues, Jan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.09009 |
Similar Items
Conditions for Catastrophic Forgetting in Multilingual Translation
by: Liu, Danni, et al.
Published: (2025)
How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?
by: Liu, Danni, et al.
Published: (2023)
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs
by: Liu, Danni, et al.
Published: (2025)
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024
by: Koneru, Sai, et al.
Published: (2024)
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
by: Lee, Hyunji, et al.
Published: (2024)
Language-Independent Representations Improve Zero-Shot Summarization
by: Solovyev, Vladimir, et al.
Published: (2024)
KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization
by: Li, Zhaolin, et al.
Published: (2025)
In-context Language Learning for Endangered Languages in Speech Recognition
by: Li, Zhaolin, et al.
Published: (2025)
Contrastive Learning for Task-Independent SpeechLLM-Pretraining
by: Züfle, Maike, et al.
Published: (2024)
RASST: Fast Cross-modal Retrieval-Augmented Simultaneous Speech Translation
by: Luo, Jiaxuan, et al.
Published: (2026)
End-to-End Evaluation for Low-Latency Simultaneous Speech Translation
by: Huber, Christian, et al.
Published: (2023)
KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025
by: Koneru, Sai, et al.
Published: (2025)
OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion
by: Koneru, Sai, et al.
Published: (2025)
Augmenting Automatic Speech Recognition Models with Disfluency Detection
by: Amann, Robin, et al.
Published: (2024)
Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech
by: Ouyang, Siqi, et al.
Published: (2026)
CMU's IWSLT 2025 Simultaneous Speech Translation System
by: Ouyang, Siqi, et al.
Published: (2025)
Plug, Play, and Fuse: Zero-Shot Joint Decoding via Word-Level Re-ranking Across Diverse Vocabularies
by: Koneru, Sai, et al.
Published: (2024)
Multimodal In-context Learning for ASR of Low-resource Languages
by: Li, Zhaolin, et al.
Published: (2026)
Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing
by: Koneru, Sai, et al.
Published: (2023)
When Helpful Context Leaks: Privacy Risks in Domain-Adapted ASR
by: Züfle, Maike, et al.
Published: (2026)
Improving Rare Word Translation With Dictionaries and Attention Masking
by: Sible, Kenneth J., et al.
Published: (2024)
Speech Editing -- a Summary
by: Kässmann, Tobias, et al.
Published: (2024)
Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
by: Sperber, Matthias, et al.
Published: (2024)
Early-Exit and Instant Confidence Translation Quality Estimation
by: Zouhar, Vilém, et al.
Published: (2025)
FASST: Fast LLM-based Simultaneous Speech Translation
by: Ouyang, Siqi, et al.
Published: (2024)
Do Slides Help? Multi-modal Context for Automatic Transcription of Conference Talks
by: Sinhamahapatra, Supriti, et al.
Published: (2025)
InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model
by: Ouyang, Siqi, et al.
Published: (2025)
SpeechQE: Estimating the Quality of Direct Speech Translation
by: Han, HyoJung, et al.
Published: (2024)
MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks
by: Papi, Sara, et al.
Published: (2025)
Talk2Ref: A Dataset for Reference Prediction from Scientific Talks
by: Broy, Frederik, et al.
Published: (2025)
CA*: Addressing Evaluation Pitfalls in Computation-Aware Latency for Simultaneous Speech Translation
by: Xu, Xi, et al.
Published: (2024)
Unveiling the Role of Pretraining in Direct Speech Translation
by: Alastruey, Belen, et al.
Published: (2024)
Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)
Are Generative Models Underconfident? Better Quality Estimation with Boosted Model Probability
by: Dinh, Tu Anh, et al.
Published: (2025)
Sigmoid Head for Quality Estimation under Language Ambiguity
by: Dinh, Tu Anh, et al.
Published: (2026)
Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection
by: Kim, Yumin, et al.
Published: (2025)
TARQ: Tail-Aware Reconstruction Quantization for Rare-Word Robust Automatic Speech Recognition
by: Wang, Xinyu, et al.
Published: (2026)
Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)
Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems
by: Gaido, Marco, et al.
Published: (2025)
MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations
by: Scott, Aaron, et al.
Published: (2025)