:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Jifan; Luo, Ziyue; Liu, Jia; Shroff, Ness; Nowak, Robert
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2410.02755

Similar Items

Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters
by: Luo, Ziyue, et al.
Published: (2025)

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
by: Ali, Mehdi, et al.
Published: (2025)

Constraint-Rectified Training for Efficient Chain-of-Thought
by: Wu, Qinhang, et al.
Published: (2026)

FIRM: Federated In-client Regularized Multi-objective Alignment for Large Language Models
by: Nourzad, Fatemeh, et al.
Published: (2025)

Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning?
by: Ju, Peizhong, et al.
Published: (2024)

Large Language Models Achieve Gold Medal Performance at the International Olympiad on Astronomy & Astrophysics (IOAA)
by: Pinheiro, Lucas Carrit Delgado, et al.
Published: (2025)

From Scores to Gibbs Correctors: Accelerating Uniform-Rate Discrete Diffusion Models
by: Liang, Yuchen, et al.
Published: (2026)

Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach
by: Liang, Yuchen, et al.
Published: (2024)

AHA: Human-Assisted Out-of-Distribution Generalization and Detection
by: Bai, Haoyue, et al.
Published: (2024)

Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
by: Matton, Katie, et al.
Published: (2025)

Benchmarking LLMs' Judgments with No Gold Standard
by: Xu, Shengwei, et al.
Published: (2024)

Detecting Pretraining Data from Large Language Models
by: Shi, Weijia, et al.
Published: (2023)

LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training
by: Wang, Yiming, et al.
Published: (2025)

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
by: Chen, Zhixun, et al.
Published: (2025)

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining
by: Saada, Thiziri Nait, et al.
Published: (2025)

Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual
by: Li, Yining, et al.
Published: (2026)

Sharp Convergence Rates for Masked Diffusion Models
by: Liang, Yuchen, et al.
Published: (2026)

Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)

Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
by: Wang, Xinyi, et al.
Published: (2024)

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
by: Lyu, Bohan, et al.
Published: (2025)

Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
by: Liang, Yuchen, et al.
Published: (2024)

Procedural Pretraining: Warming Up Language Models with Abstract Data
by: Jiang, Liangze, et al.
Published: (2026)

Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
by: Sam, Dylan, et al.
Published: (2025)

An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
by: Bhatt, Gantavya, et al.
Published: (2024)

Using Hallucinations to Bypass GPT4's Filter
by: Lemkin, Benjamin
Published: (2024)

On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey
by: Zhang, Meishan, et al.
Published: (2025)

Revisiting Multilingual Data Mixtures in Language Model Pretraining
by: Foroutan, Negar, et al.
Published: (2025)

Facts in Stats: Impacts of Pretraining Diversity on Language Model Generalization
by: Behnia, Tina, et al.
Published: (2025)

The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models
by: Bhaskar, Adithya, et al.
Published: (2024)

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees
by: Liang, Yuchen, et al.
Published: (2025)

How to Find the Exact Pareto Front for Multi-Objective MDPs?
by: Li, Yining, et al.
Published: (2024)

Provably Efficient Multi-Objective Bandit Algorithms under Preference-Centric Customization
by: Cao, Linfeng, et al.
Published: (2025)

Data Mixing for Large Language Models Pretraining: A Survey and Outlook
by: Chen, Zhuo, et al.
Published: (2026)

Language Models Improve When Pretraining Data Matches Target Tasks
by: Mizrahi, David, et al.
Published: (2025)

Temporal Entailment Pretraining for Clinical Language Models over EHR Data
by: Tanaka, Tatsunori, et al.
Published: (2025)

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
by: Kıcıman, Emre, et al.
Published: (2023)

Measuring and Reducing LLM Hallucination without Gold-Standard Answers
by: Wei, Jiaheng, et al.
Published: (2024)

Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation ?
by: Chiaranaipanich, Jirat, et al.
Published: (2024)

On Training Data Influence of GPT Models
by: Chai, Yekun, et al.
Published: (2024)

Monitoring State Transitions in Markovian Systems with Sampling Cost
by: Saurav, Kumar, et al.
Published: (2025)