| Main Authors: | Fan, Simin, Grangier, David, Ablin, Pierre |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2410.02498 |
Similar Items
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)
Need a Small Specialized Language Model? Plan Early!
by: Grangier, David, et al.
Published: (2024)
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
by: Ablin, Pierre, et al.
Published: (2025)
Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection
by: Bethune, Louis, et al.
Published: (2025)
The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
by: Saada, Thiziri Nait, et al.
Published: (2025)
Optimal Splitting of Language Models from Mixtures to Specialized Domains
by: Seto, Skyler, et al.
Published: (2026)
The AdEMAMix Optimizer: Better, Faster, Older
by: Pagliardini, Matteo, et al.
Published: (2024)
Scaling Laws for Mixture Pretraining Under Data Constraints
by: Sedova, Anastasiia, et al.
Published: (2026)
Nectar: Neural Estimation of Cached-Token Attention via Regression
by: Monteiro, João, et al.
Published: (2026)
Training Bilingual LMs with Data Constraints in the Targeted Language
by: Seto, Skyler, et al.
Published: (2024)
No Need to Talk: Asynchronous Mixture of Language Models
by: Filippova, Anastasiia, et al.
Published: (2024)
Scaling Laws for Optimal Data Mixtures
by: Shukor, Mustafa, et al.
Published: (2025)
Pretraining with hierarchical memories: separating long-tail and common knowledge
by: Pouransari, Hadi, et al.
Published: (2025)
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
by: Lu, Keming, et al.
Published: (2024)
The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)
DoGE: Domain Reweighting with Generalization Estimation
by: Fan, Simin, et al.
Published: (2023)
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
by: Wu, Yongtao, et al.
Published: (2025)
Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)
Why Is RLHF Alignment Shallow? A Gradient Analysis
by: Young, Robin
Published: (2026)
Robust Multi-Objective Preference Alignment with Online DPO
by: Gupta, Raghav, et al.
Published: (2025)
Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning
by: Fan, Ziqing, et al.
Published: (2025)
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
by: Wang, Fei, et al.
Published: (2024)
DSPA: Dynamic SAE Steering for Data-Efficient Preference Alignment
by: Wedgwood, James, et al.
Published: (2026)
MixDPO: Modeling Preference Strength for Pluralistic Alignment
by: Imai, Saki, et al.
Published: (2026)
MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment
by: Wang, Tianze, et al.
Published: (2025)
Value Alignment from Unstructured Text
by: Padhi, Inkit, et al.
Published: (2024)
Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
by: Yang, Junming, et al.
Published: (2025)
Recent Advances in Large Langauge Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation
by: Chen, Simin, et al.
Published: (2025)
A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
by: Cao, Chengtai, et al.
Published: (2022)
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
by: Yuan, Hui, et al.
Published: (2024)
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
by: Abbes, Istabrak, et al.
Published: (2025)
Efficient Alignment of Large Language Models via Data Sampling
by: Khera, Amrit, et al.
Published: (2024)
SAIL: Self-Improving Efficient Online Alignment of Large Language Models
by: Ding, Mucong, et al.
Published: (2024)
Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
by: Lu, Yining, et al.
Published: (2025)
Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
by: Rannen-Triki, Amal, et al.
Published: (2024)
Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)
When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
by: Deng, Mengyi, et al.
Published: (2025)
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
by: Gallego, Víctor
Published: (2024)
Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
by: Li, Chengao, et al.
Published: (2025)
Alignment-Constrained Dynamic Pruning for LLMs: Identifying and Preserving Alignment-Critical Circuits
by: Patel, Dev, et al.
Published: (2025)