:: Library Catalog

Bibliographic Details
Main Authors:	Fan, Simin; Grangier, David; Ablin, Pierre
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2410.02498

Similar Items

Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)

Need a Small Specialized Language Model? Plan Early!
by: Grangier, David, et al.
Published: (2024)

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
by: Ablin, Pierre, et al.
Published: (2025)

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection
by: Bethune, Louis, et al.
Published: (2025)

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
by: Saada, Thiziri Nait, et al.
Published: (2025)

Optimal Splitting of Language Models from Mixtures to Specialized Domains
by: Seto, Skyler, et al.
Published: (2026)

The AdEMAMix Optimizer: Better, Faster, Older
by: Pagliardini, Matteo, et al.
Published: (2024)

Scaling Laws for Mixture Pretraining Under Data Constraints
by: Sedova, Anastasiia, et al.
Published: (2026)

Nectar: Neural Estimation of Cached-Token Attention via Regression
by: Monteiro, João, et al.
Published: (2026)

Training Bilingual LMs with Data Constraints in the Targeted Language
by: Seto, Skyler, et al.
Published: (2024)

No Need to Talk: Asynchronous Mixture of Language Models
by: Filippova, Anastasiia, et al.
Published: (2024)

Scaling Laws for Optimal Data Mixtures
by: Shukor, Mustafa, et al.
Published: (2025)

Pretraining with hierarchical memories: separating long-tail and common knowledge
by: Pouransari, Hadi, et al.
Published: (2025)

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
by: Lu, Keming, et al.
Published: (2024)

The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)

DoGE: Domain Reweighting with Generalization Estimation
by: Fan, Simin, et al.
Published: (2023)

Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
by: Wu, Yongtao, et al.
Published: (2025)

Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)

Why Is RLHF Alignment Shallow? A Gradient Analysis
by: Young, Robin
Published: (2026)

Robust Multi-Objective Preference Alignment with Online DPO
by: Gupta, Raghav, et al.
Published: (2025)

Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning
by: Fan, Ziqing, et al.
Published: (2025)

Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
by: Wang, Fei, et al.
Published: (2024)

DSPA: Dynamic SAE Steering for Data-Efficient Preference Alignment
by: Wedgwood, James, et al.
Published: (2026)

MixDPO: Modeling Preference Strength for Pluralistic Alignment
by: Imai, Saki, et al.
Published: (2026)

MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment
by: Wang, Tianze, et al.
Published: (2025)

Value Alignment from Unstructured Text
by: Padhi, Inkit, et al.
Published: (2024)

Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
by: Yang, Junming, et al.
Published: (2025)

Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
by: Chen, Simin, et al.
Published: (2025)

A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
by: Cao, Chengtai, et al.
Published: (2022)

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
by: Yuan, Hui, et al.
Published: (2024)

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
by: Abbes, Istabrak, et al.
Published: (2025)

Efficient Alignment of Large Language Models via Data Sampling
by: Khera, Amrit, et al.
Published: (2024)

SAIL: Self-Improving Efficient Online Alignment of Large Language Models
by: Ding, Mucong, et al.
Published: (2024)

Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
by: Lu, Yining, et al.
Published: (2025)

Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
by: Rannen-Triki, Amal, et al.
Published: (2024)

Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)

When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
by: Deng, Mengyi, et al.
Published: (2025)

Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
by: Gallego, Víctor
Published: (2024)

Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
by: Li, Chengao, et al.
Published: (2025)

Alignment-Constrained Dynamic Pruning for LLMs: Identifying and Preserving Alignment-Critical Circuits
by: Patel, Dev, et al.
Published: (2025)