| Main Authors: | Mąka, Paweł, Semerci, Yusuf Can, Scholtes, Jan, Spanakis, Gerasimos |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.14031 |
Similar Items
Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models
by: Mąka, Paweł, et al.
Published: (2024)
Sequence Shortening for Context-Aware Machine Translation
by: Mąka, Paweł, et al.
Published: (2024)
Cross-Modal Robustness Transfer (CMRT): Training Robust Speech Translation Models Using Adversarial Text
by: Issam, Abderrahmane, et al.
Published: (2026)
Fixed and Adaptive Simultaneous Machine Translation Strategies Using Adapters
by: Issam, Abderrahmane, et al.
Published: (2024)
DTW-Align: Bridging the Modality Gap in End-to-End Speech Translation with Dynamic Time Warping Alignment
by: Issam, Abderrahmane, et al.
Published: (2025)
Language Models as Artificial Learners: Investigating Crosslinguistic Influence
by: Issam, Abderrahmane, et al.
Published: (2026)
A Representation Level Analysis of NMT Model Robustness to Grammatical Errors
by: Issam, Abderrahmane, et al.
Published: (2025)
Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning
by: Thil, Lucas-Andreï, et al.
Published: (2024)
More Compute Is What You Need
by: Guo, Zhen
Published: (2024)
DIDS: Domain Impact-aware Data Sampling for Large Language Model Training
by: Shi, Weijie, et al.
Published: (2025)
Did You Forget What I Asked? Prospective Memory Failures in Large Language Models
by: Mittal, Avni
Published: (2026)
What Would You Ask When You First Saw $a^2+b^2=c^2$? Evaluating LLM on Curiosity-Driven Questioning
by: Javaji, Shashidhar Reddy, et al.
Published: (2024)
Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers
by: Pelofske, Elijah, et al.
Published: (2024)
Forget What You Know about LLMs Evaluations -- LLMs are Like a Chameleon
by: Cohen-Inger, Nurit, et al.
Published: (2025)
Planning and Editing What You Retrieve for Enhanced Tool Learning
by: Huang, Tenghao, et al.
Published: (2024)
Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)
Regurgitative Training: The Value of Real Data in Training Large Language Models
by: Zhang, Jinghui, et al.
Published: (2024)
You Only Train Once: Differentiable Subset Selection for Omics Data
by: Chopard, Daphné, et al.
Published: (2025)
More Agents Is All You Need
by: Li, Junyou, et al.
Published: (2024)
Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
by: Guo, Qingyan, et al.
Published: (2024)
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
by: Zeng, Xingshan, et al.
Published: (2025)
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
by: Guan, Ziyi, et al.
Published: (2024)
Many-to-English Machine Translation Tools, Data, and Pretrained Models
by: Gowda, Thamme, et al.
Published: (2021)
You Can Generate It Again: Data-to-Text Generation with Verification and Correction Prompting
by: Ren, Xuan, et al.
Published: (2023)
Reasoning with Sampling: Your Base Model is Smarter Than You Think
by: Karan, Aayush, et al.
Published: (2025)
Should You Use Your Large Language Model to Explore or Exploit?
by: Harris, Keegan, et al.
Published: (2025)
Tensor Product Attention Is All You Need
by: Zhang, Yifan, et al.
Published: (2025)
LoRA Is Slower Than You Think
by: Ko, Seokmin
Published: (2025)
Attention Smoothing Is All You Need For Unlearning
by: Zade, Saleh Zare, et al.
Published: (2026)
Fast Training Dataset Attribution via In-Context Learning
by: Fotouhi, Milad, et al.
Published: (2024)
Data Efficacy for Language Model Training
by: Dai, Yalun, et al.
Published: (2025)
Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training
by: Kesgin, H. Toprak, et al.
Published: (2024)
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
by: Xiao, Chaojun, et al.
Published: (2024)
Guidance is All You Need: Temperature-Guided Reasoning in Large Language Models
by: Gomaa, Eyad, et al.
Published: (2024)
SynthDST: Synthetic Data is All You Need for Few-Shot Dialog State Tracking
by: Kulkarni, Atharva, et al.
Published: (2024)
SAEs Are Good for Steering -- If You Select the Right Features
by: Arad, Dana, et al.
Published: (2025)
Training and Evaluating Language Models with Template-based Data Generation
by: Zhang, Yifan
Published: (2024)
Does Training on Synthetic Data Make Models Less Robust?
by: Zhang, Lingze, et al.
Published: (2025)
Is Child-Directed Speech Effective Training Data for Language Models?
by: Feng, Steven Y., et al.
Published: (2024)
Optimisation Is Not What You Need
by: Ibias, Alfredo
Published: (2025)