Library Catalog

Bibliographic Details
Main Authors:	Li, Siqi; Liu, Danni; Niehues, Jan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2409.09009

Similar Items

Conditions for Catastrophic Forgetting in Multilingual Translation
by: Liu, Danni, et al.
Published: (2025)

How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?
by: Liu, Danni, et al.
Published: (2023)

Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs
by: Liu, Danni, et al.
Published: (2025)

Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024
by: Koneru, Sai, et al.
Published: (2024)

How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations
by: Lee, Hyunji, et al.
Published: (2024)

Language-Independent Representations Improve Zero-Shot Summarization
by: Solovyev, Vladimir, et al.
Published: (2024)

KIT's Low-resource Speech Translation Systems for IWSLT2025: System Enhancement with Synthetic Data and Model Regularization
by: Li, Zhaolin, et al.
Published: (2025)

In-context Language Learning for Endangered Languages in Speech Recognition
by: Li, Zhaolin, et al.
Published: (2025)

Contrastive Learning for Task-Independent SpeechLLM-Pretraining
by: Züfle, Maike, et al.
Published: (2024)

RASST: Fast Cross-modal Retrieval-Augmented Simultaneous Speech Translation
by: Luo, Jiaxuan, et al.
Published: (2026)

End-to-End Evaluation for Low-Latency Simultaneous Speech Translation
by: Huber, Christian, et al.
Published: (2023)

KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025
by: Koneru, Sai, et al.
Published: (2025)

OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion
by: Koneru, Sai, et al.
Published: (2025)

Augmenting Automatic Speech Recognition Models with Disfluency Detection
by: Amann, Robin, et al.
Published: (2024)

Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech
by: Ouyang, Siqi, et al.
Published: (2026)

CMU's IWSLT 2025 Simultaneous Speech Translation System
by: Ouyang, Siqi, et al.
Published: (2025)

Plug, Play, and Fuse: Zero-Shot Joint Decoding via Word-Level Re-ranking Across Diverse Vocabularies
by: Koneru, Sai, et al.
Published: (2024)

Multimodal In-context Learning for ASR of Low-resource Languages
by: Li, Zhaolin, et al.
Published: (2026)

Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing
by: Koneru, Sai, et al.
Published: (2023)

When Helpful Context Leaks: Privacy Risks in Domain-Adapted ASR
by: Züfle, Maike, et al.
Published: (2026)

Improving Rare Word Translation With Dictionaries and Attention Masking
by: Sible, Kenneth J., et al.
Published: (2024)

Speech Editing -- a Summary
by: Kässmann, Tobias, et al.
Published: (2024)

Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
by: Sperber, Matthias, et al.
Published: (2024)

Early-Exit and Instant Confidence Translation Quality Estimation
by: Zouhar, Vilém, et al.
Published: (2025)

FASST: Fast LLM-based Simultaneous Speech Translation
by: Ouyang, Siqi, et al.
Published: (2024)

Do Slides Help? Multi-modal Context for Automatic Transcription of Conference Talks
by: Sinhamahapatra, Supriti, et al.
Published: (2025)

InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model
by: Ouyang, Siqi, et al.
Published: (2025)

SpeechQE: Estimating the Quality of Direct Speech Translation
by: Han, HyoJung, et al.
Published: (2024)

MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks
by: Papi, Sara, et al.
Published: (2025)

Talk2Ref: A Dataset for Reference Prediction from Scientific Talks
by: Broy, Frederik, et al.
Published: (2025)

CA*: Addressing Evaluation Pitfalls in Computation-Aware Latency for Simultaneous Speech Translation
by: Xu, Xi, et al.
Published: (2024)

Unveiling the Role of Pretraining in Direct Speech Translation
by: Alastruey, Belen, et al.
Published: (2024)

Direct Speech to Speech Translation: A Review
by: Sarim, Mohammad, et al.
Published: (2025)

Are Generative Models Underconfident? Better Quality Estimation with Boosted Model Probability
by: Dinh, Tu Anh, et al.
Published: (2025)

Sigmoid Head for Quality Estimation under Language Ambiguity
by: Dinh, Tu Anh, et al.
Published: (2026)

Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection
by: Kim, Yumin, et al.
Published: (2025)

TARQ: Tail-Aware Reconstruction Quantization for Rare-Word Robust Automatic Speech Recognition
by: Wang, Xinyu, et al.
Published: (2026)

Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)

Simulstream: Open-Source Toolkit for Evaluation and Demonstration of Streaming Speech-to-Text Translation Systems
by: Gaido, Marco, et al.
Published: (2025)

MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations
by: Scott, Aaron, et al.
Published: (2025)