| Main Author: | Fauber, Ben |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.05616 |
Similar Items
Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models
by: Fauber, Ben
Published: (2024)
Learning the Latent Rules of a Game from Data: A Chess Story
by: Fauber, Ben
Published: (2024)
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
by: Wang, Xinyi, et al.
Published: (2024)
IDEA Prune: An Integrated Enlarge-and-Prune Pipeline in Generative Language Model Pretraining
by: Li, Yixiao, et al.
Published: (2025)
SPADE: Faster Drug Discovery by Learning from Sparse Data
by: Nandakumar, Rahul, et al.
Published: (2026)
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
by: Shi, Weijia, et al.
Published: (2024)
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models
by: Tang, Haoyu, et al.
Published: (2024)
Language and Experience: A Computational Model of Social Learning in Complex Tasks
by: Colas, Cédric, et al.
Published: (2025)
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
by: Huet, Alexis, et al.
Published: (2025)
Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)
FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation--Full Version
by: Nguyen-Cong, Dat, et al.
Published: (2026)
Patent Language Model Pretraining with ModernBERT
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)
ClinicRealm: Re-evaluating Large Language Models with Conventional Machine Learning for Non-Generative Clinical Prediction Tasks
by: Zhu, Yinghao, et al.
Published: (2024)
In-context Pretraining: Language Modeling Beyond Document Boundaries
by: Shi, Weijia, et al.
Published: (2023)
Revisiting Multilingual Data Mixtures in Language Model Pretraining
by: Foroutan, Negar, et al.
Published: (2025)
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
by: Bayazit, Deniz, et al.
Published: (2023)
Sequence-to-Sequence Spanish Pre-trained Language Models
by: Araujo, Vladimir, et al.
Published: (2023)
FutureFill: Fast Generation from Convolutional Sequence Models
by: Agarwal, Naman, et al.
Published: (2024)
Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion
by: Liu, Ben, et al.
Published: (2024)
GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
by: Stepanov, Ihor, et al.
Published: (2025)
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
by: McLeish, Sean, et al.
Published: (2025)
Generating Pretraining Tokens from Organic Data for Data-Bound Scaling
by: Yu, Zichun, et al.
Published: (2026)
LoRA-Augmented Generation (LAG) for Knowledge-Intensive Language Tasks
by: Fleshman, William, et al.
Published: (2025)
A Dual-Space Framework for General Knowledge Distillation of Large Language Models
by: Zhang, Xue, et al.
Published: (2025)
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
by: Zhou, Ying, et al.
Published: (2024)
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts
by: Somayajula, Sai Ashish, et al.
Published: (2024)
Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models
by: Huang, Yukun, et al.
Published: (2025)
Tracing the Representation Geometry of Language Models from Pretraining to Post-training
by: Li, Melody Zixuan, et al.
Published: (2025)
The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation
by: Niimi, Junichiro
Published: (2026)
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
by: Hu, Mengkang, et al.
Published: (2024)
RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
by: Wu, Mian, et al.
Published: (2025)
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
by: Ali, Mehdi, et al.
Published: (2025)
ViCLSR: A Supervised Contrastive Learning Framework with Natural Language Inference for Natural Language Understanding Tasks
by: Van Huynh, Tin, et al.
Published: (2026)
RuAG: Learned-rule-augmented Generation for Large Language Models
by: Zhang, Yudi, et al.
Published: (2024)
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
by: Agarwal, Rishabh, et al.
Published: (2023)
Instruct-Tuning Pretrained Causal Language Models for Ancient Greek Papyrology and Epigraphy
by: Cullhed, Eric
Published: (2024)
BiMix: A Bivariate Data Mixing Law for Language Model Pretraining
by: Ge, Ce, et al.
Published: (2024)
Unified Multi-Task Learning & Model Fusion for Efficient Language Model Guardrailing
by: O'Neill, James, et al.
Published: (2025)
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining
by: Xiaomi, LLM-Core, et al.
Published: (2025)
BARE: Leveraging Base Language Models for Few-Shot Synthetic Data Generation
by: Zhu, Alan, et al.
Published: (2025)