| Main Author: | Grigorev, George |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.02522 |
Similar Items
PhoneLM: an Efficient and Capable Small Language Model Family through Principled Pre-training
by: Yi, Rongjie, et al.
Published: (2024)
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)
An Efficient Replay for Class-Incremental Learning with Pre-trained Models
by: Yin, Weimin, et al.
Published: (2024)
Towards Efficient Pre-training: Exploring FP4 Precision in Large Language Models
by: Zhou, Jiecheng, et al.
Published: (2025)
Pre-trained Large Language Models Learn Hidden Markov Models In-context
by: Dai, Yijia, et al.
Published: (2025)
Investigating Data Contamination for Pre-training Language Models
by: Jiang, Minhao, et al.
Published: (2024)
Aligning Pre-trained Models for Spoken Language Translation
by: Sedláček, Šimon, et al.
Published: (2024)
Sequence-to-Sequence Spanish Pre-trained Language Models
by: Araujo, Vladimir, et al.
Published: (2023)
Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model
by: Niu, Peisong, et al.
Published: (2025)
FGBERT: Function-Driven Pre-trained Gene Language Model for Metagenomics
by: Duan, ChenRui, et al.
Published: (2024)
Integrating Pre-trained Language Model into Neural Machine Translation
by: Hwang, Soon-Jae, et al.
Published: (2023)
Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning
by: Mattos, João, et al.
Published: (2026)
Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models
by: Wen, Ziting, et al.
Published: (2024)
Pre-training a Transformer-Based Generative Model Using a Small Sepedi Dataset
by: Ramalepe, Simon P., et al.
Published: (2025)
OSF: On Pre-training and Scaling of Sleep Foundation Models
by: Shuai, Zitao, et al.
Published: (2026)
Transfer Learning with Pre-trained Conditional Generative Models
by: Yamaguchi, Shin'ya, et al.
Published: (2022)
Scaling Laws for Pre-training Agents and World Models
by: Pearce, Tim, et al.
Published: (2024)
Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models
by: Schröder, Christopher, et al.
Published: (2024)
Simple and Scalable Strategies to Continually Pre-train Large Language Models
by: Ibrahim, Adam, et al.
Published: (2024)
Pre-training Limited Memory Language Models with Internal and External Knowledge
by: Zhao, Linxi, et al.
Published: (2025)
Sparse is Enough in Fine-tuning Pre-trained Large Language Models
by: Song, Weixi, et al.
Published: (2023)
Automated Traffic Incident Response Plans using Generative Artificial Intelligence: Part 1 -- Building the Incident Response Benchmark
by: Grigorev, Artur, et al.
Published: (2025)
Machine Unlearning of Pre-trained Large Language Models
by: Yao, Jin, et al.
Published: (2024)
The Future of Large Language Model Pre-training is Federated
by: Sani, Lorenzo, et al.
Published: (2024)
Revisiting Pre-trained Language Models for Vulnerability Detection
by: Li, Youpeng, et al.
Published: (2025)
A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning
by: Gai, Keke, et al.
Published: (2025)
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
by: Jin, Jiarui, et al.
Published: (2025)
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
by: Lin, Xi Victoria, et al.
Published: (2024)
Graph Generative Pre-trained Transformer
by: Chen, Xiaohui, et al.
Published: (2025)
HiFloat4 Format for Language Model Pre-training on Ascend NPUs
by: Taghian, Mehran, et al.
Published: (2026)
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
by: Chiang, Shang-Hsuan, et al.
Published: (2024)
CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models
by: Harne, Sarthak, et al.
Published: (2024)
Heterogeneous Graph Pre-training Based Model for Secure and Efficient Prediction of Default Risk Propagation among Bond Issuers
by: Li, Xurui, et al.
Published: (2025)
Accelerating Reinforcement Learning Algorithms Convergence using Pre-trained Large Language Models as Tutors With Advice Reusing
by: Toral, Lukas, et al.
Published: (2025)
Pre-trained Molecular Language Models with Random Functional Group Masking
by: Peng, Tianhao, et al.
Published: (2024)
Priming: Hybrid State Space Models From Pre-trained Transformers
by: Chattopadhyay, Aditya, et al.
Published: (2026)
A Pre-trained Data Deduplication Model based on Active Learning
by: Shi, Haochen, et al.
Published: (2023)
Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only
by: Xiao, Wei, et al.
Published: (2025)
Unleashing The Power of Pre-Trained Language Models for Irregularly Sampled Time Series
by: Zhang, Weijia, et al.
Published: (2024)
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
by: Farhat, Sean, et al.
Published: (2024)