Bibliographic Details
Main Authors:	Pouransari, Hadi; Grangier, David; Thomas, C; Kirchhof, Michael; Tuzel, Oncel
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2510.02375

Similar Items

MobileCLIP2: Improving Multi-Modal Reinforced Training
by: Faghri, Fartash, et al.
Published: (2025)

Learning to Reason for Hallucination Span Detection
by: Su, Hsuan, et al.
Published: (2025)

CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)

Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization
by: Lu, Yen-Ju, et al.
Published: (2025)

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2023)

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2025)

TiC-CLIP: Continual Training of CLIP Models
by: Garg, Saurabh, et al.
Published: (2023)

Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents
by: Kirchhof, Michael, et al.
Published: (2025)

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)

Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?
by: Öncel, Fırat, et al.
Published: (2024)

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
by: Mehta, Sachin, et al.
Published: (2024)

FastVLM: Efficient Vision Encoding for Vision Language Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
by: Hsieh, Cheng-Yu, et al.
Published: (2025)

The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
by: Vemulapalli, Raviteja, et al.
Published: (2023)

Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
by: Santilli, Andrea, et al.
Published: (2025)

SelfReflect: Can LLMs Communicate Their Internal Answer Distribution?
by: Kirchhof, Michael, et al.
Published: (2025)

Uncertainties of Latent Representations in Computer Vision
by: Kirchhof, Michael
Published: (2024)

BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design
by: Choudhury, Deepro, et al.
Published: (2025)

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2026)

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
by: Ablin, Pierre, et al.
Published: (2025)

Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)

Learning from Self Critique and Refinement for Faithful LLM Summarization
by: Hu, Ting-Yao, et al.
Published: (2025)

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
by: Mirzadeh, Iman, et al.
Published: (2024)

To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining
by: Singh, Karan, et al.
Published: (2026)

Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)

Pretrained Hybrids with MAD Skills
by: Roberts, Nicholas, et al.
Published: (2024)

Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
by: Tice, Cameron, et al.
Published: (2026)

Linguistic Blind Spots of Large Language Models
by: Cheng, Jiali, et al.
Published: (2025)

Tool Unlearning for Tool-Augmented LLMs
by: Cheng, Jiali, et al.
Published: (2025)

FairFlow: Mitigating Dataset Biases through Undecided Learning
by: Cheng, Jiali, et al.
Published: (2025)

TPTT: Transforming Pretrained Transformers into Titans
by: Furfaro, Fabien
Published: (2025)

RLP: Reinforcement as a Pretraining Objective
by: Hatamizadeh, Ali, et al.
Published: (2025)

Memorization Dynamics of Fill-in-the-Middle Pretraining
by: von Arx, Tobias, et al.
Published: (2026)

Fresh in memory: Training-order recency is linearly encoded in language model activations
by: Krasheninnikov, Dmitrii, et al.
Published: (2025)

Patent Language Model Pretraining with ModernBERT
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)

Output Embedding Centering for Stable LLM Pretraining
by: Stollenwerk, Felix, et al.
Published: (2026)

Collaboratively adding new knowledge to an LLM
by: Lee, Rhui Dih, et al.
Published: (2024)

Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals
by: Liu, Ran, et al.
Published: (2023)

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
by: Ali, Mehdi, et al.
Published: (2025)