| Main Authors: | Li, Tianjian, Xu, Haoran, Tan, Weiting, Murray, Kenton, Khashabi, Daniel |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2410.04579 |
Similar Items
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
by: Li, Tianjian, et al.
Published: (2023)
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
by: Li, Tianjian, et al.
Published: (2025)
Jointly Reinforcing Diversity and Quality in Language Model Generations
by: Li, Tianjian, et al.
Published: (2025)
The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure
by: Bafna, Niyati, et al.
Published: (2025)
DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging
by: Verma, Neha, et al.
Published: (2025)
Merging Feed-Forward Sublayers for Compressed Transformers
by: Verma, Neha, et al.
Published: (2025)
Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher
by: Uzunoglu, Arda, et al.
Published: (2026)
The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks
by: Uzunoglu, Arda, et al.
Published: (2025)
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
by: Mishra, Aayush, et al.
Published: (2025)
Do pretrained Transformers Learn In-Context by Gradient Descent?
by: Shen, Lingfeng, et al.
Published: (2023)
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
by: Zhang, Jingyu, et al.
Published: (2024)
SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
by: Zhang, Dengjia, et al.
Published: (2026)
DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
by: Zhou, Yuhang, et al.
Published: (2025)
Position: The Turing-Completeness of Autoregressive Transformers Relies Heavily on Context Management
by: Cui, Guanyu, et al.
Published: (2026)
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
by: Tan, Weiting, et al.
Published: (2024)
AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets
by: Lesci, Pietro, et al.
Published: (2024)
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
by: Xu, Haoran, et al.
Published: (2024)
k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
by: Hou, Abe Bohan, et al.
Published: (2024)
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
by: Liu, Wei, et al.
Published: (2026)
It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
by: Lu, Taiming, et al.
Published: (2024)
SELF-[IN]CORRECT: LLMs Struggle with Discriminating Self-Generated Responses
by: Jiang, Dongwei, et al.
Published: (2024)
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
by: Shen, Lingfeng, et al.
Published: (2024)
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
by: Ju, Yiming, et al.
Published: (2024)
SABER: Switchable and Balanced Training for Efficient LLM Reasoning
by: Zhao, Kai, et al.
Published: (2025)
RedPajama: an Open Dataset for Training Large Language Models
by: Weber, Maurice, et al.
Published: (2024)
Training Superior Sparse Autoencoders for Instruct Models
by: Li, Jiaming, et al.
Published: (2025)
Training a Huggingface Model on AWS Sagemaker (Without Tears)
by: Tan, Liling
Published: (2025)
Effective Reasoning Chains Reduce Intrinsic Dimensionality
by: Prasad, Archiki, et al.
Published: (2026)
Exploring Imbalanced Annotations for Effective In-Context Learning
by: Gao, Hongfu, et al.
Published: (2025)
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)
Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
by: Qiu, Zihan, et al.
Published: (2025)
Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space
by: Shen, Yangyi, et al.
Published: (2026)
MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training
by: Li, Jiacheng, et al.
Published: (2026)
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs
by: Xia, Yunhui, et al.
Published: (2025)
Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks
by: Alsulaimawi, Zahir
Published: (2026)
Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
by: Chung, Woojin, et al.
Published: (2025)
Data Augmentation for Classification of Negative Pregnancy Outcomes in Imbalanced Data
by: Biswas, Md Badsha
Published: (2025)
ALTA: Compiler-Based Analysis of Transformers
by: Shaw, Peter, et al.
Published: (2024)
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
by: Rodriguez, Juan, et al.
Published: (2024)