:: Library Catalog

Bibliographic Details
Main Authors:	Li, Tianjian; Xu, Haoran; Tan, Weiting; Murray, Kenton; Khashabi, Daniel
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2410.04579

Similar Items

Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
by: Li, Tianjian, et al.
Published: (2023)

SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
by: Li, Tianjian, et al.
Published: (2025)

Jointly Reinforcing Diversity and Quality in Language Model Generations
by: Li, Tianjian, et al.
Published: (2025)

The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure
by: Bafna, Niyati, et al.
Published: (2025)

DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging
by: Verma, Neha, et al.
Published: (2025)

Merging Feed-Forward Sublayers for Compressed Transformers
by: Verma, Neha, et al.
Published: (2025)

Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher
by: Uzunoglu, Arda, et al.
Published: (2026)

The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks
by: Uzunoglu, Arda, et al.
Published: (2025)

IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
by: Mishra, Aayush, et al.
Published: (2025)

Do pretrained Transformers Learn In-Context by Gradient Descent?
by: Shen, Lingfeng, et al.
Published: (2023)

Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
by: Zhang, Jingyu, et al.
Published: (2024)

SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
by: Zhang, Dengjia, et al.
Published: (2026)

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
by: Zhou, Yuhang, et al.
Published: (2025)

Position: The Turing-Completeness of Autoregressive Transformers Relies Heavily on Context Management
by: Cui, Guanyu, et al.
Published: (2026)

DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
by: Tan, Weiting, et al.
Published: (2024)

AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets
by: Lesci, Pietro, et al.
Published: (2024)

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
by: Xu, Haoran, et al.
Published: (2024)

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
by: Hou, Abe Bohan, et al.
Published: (2024)

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
by: Liu, Wei, et al.
Published: (2026)

It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF
by: Lu, Taiming, et al.
Published: (2024)

SELF-[IN]CORRECT: LLMs Struggle with Discriminating Self-Generated Responses
by: Jiang, Dongwei, et al.
Published: (2024)

The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
by: Shen, Lingfeng, et al.
Published: (2024)

Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
by: Ju, Yiming, et al.
Published: (2024)

SABER: Switchable and Balanced Training for Efficient LLM Reasoning
by: Zhao, Kai, et al.
Published: (2025)

RedPajama: an Open Dataset for Training Large Language Models
by: Weber, Maurice, et al.
Published: (2024)

Training Superior Sparse Autoencoders for Instruct Models
by: Li, Jiaming, et al.
Published: (2025)

Training a Huggingface Model on AWS Sagemaker (Without Tears)
by: Tan, Liling
Published: (2025)

Effective Reasoning Chains Reduce Intrinsic Dimensionality
by: Prasad, Archiki, et al.
Published: (2026)

Exploring Imbalanced Annotations for Effective In-Context Learning
by: Gao, Hongfu, et al.
Published: (2025)

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
by: Qiu, Zihan, et al.
Published: (2025)

Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space
by: Shen, Yangyi, et al.
Published: (2026)

MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training
by: Li, Jiacheng, et al.
Published: (2026)

LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs
by: Xia, Yunhui, et al.
Published: (2025)

Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks
by: Alsulaimawi, Zahir
Published: (2026)

Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
by: Chung, Woojin, et al.
Published: (2025)

Data Augmentation for Classification of Negative Pregnancy Outcomes in Imbalanced Data
by: Biswas, Md Badsha
Published: (2025)

ALTA: Compiler-Based Analysis of Transformers
by: Shaw, Peter, et al.
Published: (2024)

BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
by: Rodriguez, Juan, et al.
Published: (2024)