| Main Authors: | Wang, Kuang-Da, Ding, Shuoyang, Yang, Chao-Han Huck, Hsieh, Ping-Chun, Peng, Wen-Chih, Lavrukhin, Vitaly, Ginsburg, Boris |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.17249 |
Similar Items
Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech
by: Ouyang, Siqi, et al.
Published: (2026)
EMMeTT: Efficient Multimodal Machine Translation Training
by: Żelasko, Piotr, et al.
Published: (2024)
Chain-of-Thought Prompting for Speech Translation
by: Hu, Ke, et al.
Published: (2024)
Open Automatic Speech Recognition Models for Classical and Modern Standard Arabic
by: Grigoryan, Lilit, et al.
Published: (2025)
Anticipating Future with Large Language Model for Simultaneous Machine Translation
by: Ouyang, Siqi, et al.
Published: (2024)
Imitation Learning of Correlated Policies in Stackelberg Games
by: Wang, Kuang-Da, et al.
Published: (2025)
TurboBias: Universal ASR Context-Biasing powered by GPU-accelerated Phrase-Boosting Tree
by: Andrusenko, Andrei, et al.
Published: (2025)
Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter
by: Andrusenko, Andrei, et al.
Published: (2024)
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator
by: Bataev, Vladimir, et al.
Published: (2023)
Label-Looping: Highly Efficient Decoding for Transducers
by: Bataev, Vladimir, et al.
Published: (2024)
Test-Time Alignment for Large Language Models via Textual Model Predictive Control
by: Wang, Kuang-Da, et al.
Published: (2025)
Granary: Speech Recognition and Translation Dataset in 25 European Languages
by: Koluguri, Nithin Rao, et al.
Published: (2025)
Offline Imitation of Badminton Player Behavior via Experiential Contexts and Brownian Motion
by: Wang, Kuang-Da, et al.
Published: (2024)
NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding
by: Bataev, Vladimir, et al.
Published: (2025)
FlexCTC: GPU-powered CTC Beam Decoding With Advanced Contextual Abilities
by: Grigoryan, Lilit, et al.
Published: (2025)
Pushing the Limits of Beam Search Decoding for Transducer-based ASR models
by: Grigoryan, Lilit, et al.
Published: (2025)
Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization
by: Andrusenko, Andrei, et al.
Published: (2026)
Word Level Timestamp Generation for Automatic Speech Recognition and Translation
by: Hu, Ke, et al.
Published: (2025)
DDOT: A Derivative-directed Dual-decoder Ordinary Differential Equation Transformer for Dynamic System Modeling
by: Chang, Yang, et al.
Published: (2025)
A Chat About Boring Problems: Studying GPT-based text normalization
by: Zhang, Yang, et al.
Published: (2023)
Methods to Increase the Amount of Data for Speech Recognition for Low Resource Languages
by: Ayrapetyan, Alexan, et al.
Published: (2025)
Unified Semi-Supervised Pipeline for Automatic Speech Recognition
by: Tadevosyan, Nune, et al.
Published: (2025)
BADGE: BADminton report Generation and Evaluation with LLM
by: Chiang, Shang-Hsuan, et al.
Published: (2024)
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data
by: Puvvada, Krishna C., et al.
Published: (2024)
nGPT: Normalized Transformer with Representation Learning on the Hypersphere
by: Loshchilov, Ilya, et al.
Published: (2024)
Investigating Length Issues in Document-level Machine Translation
by: Peng, Ziqian, et al.
Published: (2024)
Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains
by: Zouhar, Vilém, et al.
Published: (2024)
TTS-Transducer: End-to-End Speech Synthesis with Neural Transducer
by: Bataev, Vladimir, et al.
Published: (2025)
Pre-training Tensor-Train Networks Facilitates Machine Learning with Variational Quantum Circuits
by: Qi, Jun, et al.
Published: (2023)
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
by: Sun, Simeng, et al.
Published: (2025)
Training and Inference Efficiency of Encoder-Decoder Speech Models
by: Żelasko, Piotr, et al.
Published: (2025)
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
by: Hu, Yuchen, et al.
Published: (2024)
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition
by: Xu, Hainan, et al.
Published: (2024)
Diminishing Exploration: A Minimalist Approach to Piecewise Stationary Multi-Armed Bandits
by: Li, Kuan-Ta, et al.
Published: (2024)
Secondary Stiefel-Whitney numbers and corresponding cobordism groups
by: Lavrukhin, Viktor
Published: (2025)
OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models
by: Chen, William, et al.
Published: (2025)
APAR: Modeling Irregular Target Functions in Tabular Regression via Arithmetic-Aware Pre-Training and Adaptive-Regularized Fine-Tuning
by: Wu, Hong-Wei, et al.
Published: (2024)
NEWSAGENT: Benchmarking Multimodal Agents as Journalists with Real-World Newswriting Tasks
by: Chien, Yen-Che, et al.
Published: (2025)
Mixture Experts with Test-Time Self-Supervised Aggregation for Tabular Imbalanced Regression
by: Wang, Yung-Chien, et al.
Published: (2025)
A Modularized Framework for Piecewise-Stationary Restless Bandits
by: Li, Kuan-Ta, et al.
Published: (2026)