| Main Authors: | Pouransari, Hadi, Grangier, David, Thomas, C, Kirchhof, Michael, Tuzel, Oncel |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.02375 |
Similar Items
MobileCLIP2: Improving Multi-Modal Reinforced Training
by: Faghri, Fartash, et al.
Published: (2025)
Learning to Reason for Hallucination Span Detection
by: Su, Hsuan, et al.
Published: (2025)
CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)
Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization
by: Lu, Yen-Ju, et al.
Published: (2025)
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2023)
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2025)
TiC-CLIP: Continual Training of CLIP Models
by: Garg, Saurabh, et al.
Published: (2023)
Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents
by: Kirchhof, Michael, et al.
Published: (2025)
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)
Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?
by: Öncel, Fırat, et al.
Published: (2024)
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
by: Mehta, Sachin, et al.
Published: (2024)
FastVLM: Efficient Vision Encoding for Vision Language Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
by: Hsieh, Cheng-Yu, et al.
Published: (2025)
The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
by: Vemulapalli, Raviteja, et al.
Published: (2023)
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
by: Santilli, Andrea, et al.
Published: (2025)
SelfReflect: Can LLMs Communicate Their Internal Answer Distribution?
by: Kirchhof, Michael, et al.
Published: (2025)
Uncertainties of Latent Representations in Computer Vision
by: Kirchhof, Michael
Published: (2024)
BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design
by: Choudhury, Deepro, et al.
Published: (2025)
Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2026)
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
by: Ablin, Pierre, et al.
Published: (2025)
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
by: Grangier, David, et al.
Published: (2024)
Learning from Self Critique and Refinement for Faithful LLM Summarization
by: Hu, Ting-Yao, et al.
Published: (2025)
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
by: Mirzadeh, Iman, et al.
Published: (2024)
To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining
by: Singh, Karan, et al.
Published: (2026)
Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)
Pretrained Hybrids with MAD Skills
by: Roberts, Nicholas, et al.
Published: (2024)
Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment
by: Tice, Cameron, et al.
Published: (2026)
Linguistic Blind Spots of Large Language Models
by: Cheng, Jiali, et al.
Published: (2025)
Tool Unlearning for Tool-Augmented LLMs
by: Cheng, Jiali, et al.
Published: (2025)
FairFlow: Mitigating Dataset Biases through Undecided Learning
by: Cheng, Jiali, et al.
Published: (2025)
TPTT: Transforming Pretrained Transformers into Titans
by: Furfaro, Fabien
Published: (2025)
RLP: Reinforcement as a Pretraining Objective
by: Hatamizadeh, Ali, et al.
Published: (2025)
Memorization Dynamics of Fill-in-the-Middle Pretraining
by: von Arx, Tobias, et al.
Published: (2026)
Fresh in memory: Training-order recency is linearly encoded in language model activations
by: Krasheninnikov, Dmitrii, et al.
Published: (2025)
Patent Language Model Pretraining with ModernBERT
by: Yousefiramandi, Amirhossein, et al.
Published: (2025)
Output Embedding Centering for Stable LLM Pretraining
by: Stollenwerk, Felix, et al.
Published: (2026)
Collaboratively adding new knowledge to an LLM
by: Lee, Rhui Dih, et al.
Published: (2024)
Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals
by: Liu, Ran, et al.
Published: (2023)
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
by: Ali, Mehdi, et al.
Published: (2025)