:: Library Catalog

Bibliographic Details
Main Authors:	Takase, Sho; Kiyono, Shun; Kobayashi, Sosuke; Suzuki, Jun
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2312.16903

Similar Items

Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning
by: Yano, Kazuki, et al.
Published: (2026)

Efficient Construction of Model Family through Progressive Training Using Model Expansion
by: Yano, Kazuki, et al.
Published: (2025)

Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability
by: Ri, Ryokan, et al.
Published: (2024)

Large Vocabulary Size Improves Large Language Models
by: Takase, Sho, et al.
Published: (2024)

Natural Fingerprints of Large Language Models
by: Suzuki, Teppei, et al.
Published: (2025)

Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective
by: Kajitsuka, Tokio, et al.
Published: (2026)

Pre-trained Large Language Models for Financial Sentiment Analysis
by: Luo, Wei, et al.
Published: (2024)

Understanding Data Temporality Impact on Large Language Models Pre-training
by: Pilchen, Hippolyte, et al.
Published: (2026)

DataMan: Data Manager for Pre-training Large Language Models
by: Peng, Ru, et al.
Published: (2025)

Pre-training Distillation for Large Language Models: A Design Space Exploration
by: Peng, Hao, et al.
Published: (2024)

Efficient Language Adaptive Pre-training: Extending State-of-the-Art Large Language Models for Polish
by: Ruciński, Szymon
Published: (2024)

SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models
by: Arora, Samir, et al.
Published: (2024)

Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
by: Qian, Chen, et al.
Published: (2024)

Probing Language Models for Pre-training Data Detection
by: Liu, Zhenhua, et al.
Published: (2024)

MELT: Materials-aware Continued Pre-training for Language Model Adaptation to Materials Science
by: Kim, Junho, et al.
Published: (2024)

Sparse is Enough in Fine-tuning Pre-trained Large Language Models
by: Song, Weixi, et al.
Published: (2023)

Simple and Scalable Strategies to Continually Pre-train Large Language Models
by: Ibrahim, Adam, et al.
Published: (2024)

Machine Unlearning of Pre-trained Large Language Models
by: Yao, Jin, et al.
Published: (2024)

Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models
by: Ma, Shengjie, et al.
Published: (2025)

PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models
by: Ranaldi, Leonardo, et al.
Published: (2023)

Can Pre-trained Language Models Understand Chinese Humor?
by: Chen, Yuyan, et al.
Published: (2024)

How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
by: Lv, Kangtao, et al.
Published: (2025)

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)

Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model
by: Xia, Fei, et al.
Published: (2024)

From N-grams to Pre-trained Multilingual Models For Language Identification
by: Sindane, Thapelo, et al.
Published: (2024)

Boosting Explainability through Selective Rationalization in Pre-trained Language Models
by: Yuan, Libing, et al.
Published: (2025)

RegMix: Data Mixture as Regression for Language Model Pre-training
by: Liu, Qian, et al.
Published: (2024)

Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models
by: Kadan, Anoop, et al.
Published: (2023)

DocMamba: Efficient Document Pre-training with State Space Model
by: Hu, Pengfei, et al.
Published: (2024)

More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models
by: Chen, Evan, et al.
Published: (2025)

Sequence-to-Sequence Spanish Pre-trained Language Models
by: Araujo, Vladimir, et al.
Published: (2023)

Investigating Data Contamination for Pre-training Language Models
by: Jiang, Minhao, et al.
Published: (2024)

Aligning Pre-trained Models for Spoken Language Translation
by: Sedláček, Šimon, et al.
Published: (2024)

Efficient Data Learning for Open Information Extraction with Pre-trained Language Models
by: Fan, Zhiyuan, et al.
Published: (2023)

Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models
by: Tang, Lei, et al.
Published: (2025)

More is More: Addition Bias in Large Language Models
by: Santagata, Luca, et al.
Published: (2024)

Superpixel Semantics Representation and Pre-training for Vision-Language Task
by: Zhang, Siyu, et al.
Published: (2023)

Zero-Shot Spam Email Classification Using Pre-trained Large Language Models
by: Rojas-Galeano, Sergio
Published: (2024)

Refactoring Programs Using Large Language Models with Few-Shot Examples
by: Shirafuji, Atsushi, et al.
Published: (2023)

Topic Over Source: The Key to Effective Data Mixing for Language Models Pre-training
by: Peng, Jiahui, et al.
Published: (2025)