| Main Authors: | Wu, Yu-Hang, Liu, Qin-Yuan, Zhao, Qiu-Yang, Jiang, Bo, Yang, Jiang-Feng, Cong, Qing-Wei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.11416 |
Similar Items
FreezeEmpath: Efficient Training for Empathetic Spoken Chatbots with Frozen LLMs
by: Hong, Yun, et al.
Published: (2026)
LoopRPT: Reinforcement Pre-Training for Looped Language Models
by: Tang, Guo, et al.
Published: (2026)
Analysing The Impact of Sequence Composition on Language Model Pre-Training
by: Zhao, Yu, et al.
Published: (2024)
Reinforcement Learning on Pre-Training Data
by: Li, Siheng, et al.
Published: (2025)
PowLU: An Activation Function for Stable Pre-Training of LLMs
by: Jiang, Peijie, et al.
Published: (2026)
Continual Pre-Training is (not) What You Need in Domain Adaption
by: Chen, Pin-Er, et al.
Published: (2025)
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training
by: Zhuang, Yuchen, et al.
Published: (2025)
Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
by: Zheng, Wenzhen, et al.
Published: (2024)
Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization
by: Hajimolahoseini, Habib, et al.
Published: (2023)
Evolution of Concepts in Language Model Pre-Training
by: Ge, Xuyang, et al.
Published: (2025)
Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping
by: Chen, Yao, et al.
Published: (2026)
Reinforcement Pre-Training
by: Dong, Qingxiu, et al.
Published: (2025)
Autoregressive Pre-Training on Pixels and Texts
by: Chai, Yekun, et al.
Published: (2024)
MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability
by: Wu, Weiqi, et al.
Published: (2025)
Revise, Don't Freeze: Sampler-Matched Training for Self-Correcting Masked Diffusion Language Models
by: Yu, Longxuan, et al.
Published: (2026)
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality
by: Qing, Peijun, et al.
Published: (2024)
Structured First-Layer Initialization Pre-Training Techniques to Accelerate Training Process Based on $\varepsilon$-Rank
by: Tang, Tao, et al.
Published: (2025)
Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation
by: Luo, Kangcheng, et al.
Published: (2025)
Improving Language Models Trained on Translated Data with Continual Pre-Training and Dictionary Learning Analysis
by: Boughorbel, Sabri, et al.
Published: (2024)
MultiGPrompt for Multi-Task Pre-Training and Prompting on Graphs
by: Yu, Xingtong, et al.
Published: (2023)
SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values
by: Sun, Chengwei, et al.
Published: (2024)
Unifying Structured Data as Graph for Data-to-Text Pre-Training
by: Li, Shujie, et al.
Published: (2024)
Chain of Methodologies: Scaling Test Time Computation without Training
by: Liu, Cong, et al.
Published: (2025)
Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models
by: Lu, Yuheng, et al.
Published: (2024)
Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
by: Yu, Fangxu, et al.
Published: (2024)
SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization
by: Cho, Sangwoo, et al.
Published: (2024)
ILT-Iterative LoRA Training through Focus-Feedback-Fix for Multilingual Speech Recognition
by: Meng, Qingliang, et al.
Published: (2025)
Synthetic Pre-Pre-Training Improves Language Model Robustness to Noisy Pre-Training Data
by: Guo, Xu, et al.
Published: (2026)
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection
by: Chen, Tianxiang, et al.
Published: (2024)
Domain-Adaptive Continued Pre-Training of Small Language Models
by: Faroz, Salman
Published: (2025)
Integrating Pre-Trained Language Model with Physical Layer Communications
by: Lee, Ju-Hyung, et al.
Published: (2024)
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
by: Uppaal, Rheeya, et al.
Published: (2024)
Pre-Trained Policy Discriminators are General Reward Models
by: Dou, Shihan, et al.
Published: (2025)
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
by: Que, Haoran, et al.
Published: (2024)
Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
by: Pan, Haowen, et al.
Published: (2023)
NAN: A Training-Free Solution to Coefficient Estimation in Model Merging
by: Si, Chongjie, et al.
Published: (2025)
How Do Large Language Models Learn Concepts During Continual Pre-Training?
by: Yao, Barry Menglong, et al.
Published: (2026)
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks
by: Yang, Kang, et al.
Published: (2025)
Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation
by: Chen, Yifang, et al.
Published: (2024)
Beyond Fixed Length: Bucket Pre-training is All You Need
by: Yang, Qing, et al.
Published: (2024)