
Bibliographic Details
Main Authors:	Jiang, Yiding; Zhou, Allan; Feng, Zhili; Malladi, Sadhika; Kolter, J. Zico
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2410.11820

Similar Items

On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
by: Malladi, Sadhika, et al.
Published: (2022)

An Axiomatic Approach to Model-Agnostic Concept Explanations
by: Feng, Zhili, et al.
Published: (2024)

Looking beyond the next token
by: Thankaraj, Abitha, et al.
Published: (2025)

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
by: Goyal, Sachin, et al.
Published: (2024)

LESS: Selecting Influential Data for Targeted Instruction Tuning
by: Xia, Mengzhou, et al.
Published: (2024)

TOFU: A Task of Fictitious Unlearning for LLMs
by: Maini, Pratyush, et al.
Published: (2024)

Rethinking LLM Memorization through the Lens of Adversarial Compression
by: Schwarzschild, Avi, et al.
Published: (2024)

In Good GRACEs: Principled Teacher Selection for Knowledge Distillation
by: Panigrahi, Abhishek, et al.
Published: (2025)

FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers
by: Williams, Joshua Nathaniel, et al.
Published: (2024)

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence
by: Finzi, Marc, et al.
Published: (2026)

Provable unlearning in topic modeling and downstream tasks
by: Wei, Stanley, et al.
Published: (2024)

Trainable Transformer in Transformer
by: Panigrahi, Abhishek, et al.
Published: (2023)

Measuring Five-Nines Reliability: Sample-Efficient LLM Evaluation in Saturated Benchmarks
by: Kim, Eungyeup, et al.
Published: (2026)

AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs
by: Akinwande, Victor, et al.
Published: (2024)

Mimetic Initialization of MLPs
by: Trockman, Asher, et al.
Published: (2026)

Progressive distillation induces an implicit curriculum
by: Panigrahi, Abhishek, et al.
Published: (2024)

Why is SAM Robust to Label Noise?
by: Baek, Christina, et al.
Published: (2024)

Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning
by: Huang, Benhao, et al.
Published: (2026)

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
by: Razin, Noam, et al.
Published: (2024)

The Marginal Value of Momentum for Small Learning Rate SGD
by: Wang, Runzhe, et al.
Published: (2023)

Predicting the Performance of Black-box LLMs through Follow-up Queries
by: Sam, Dylan, et al.
Published: (2025)

Training a Generally Curious Agent
by: Tajwar, Fahim, et al.
Published: (2025)

Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
by: Xu, Yixuan Even, et al.
Published: (2025)

Diffusing Differentiable Representations
by: Savani, Yash, et al.
Published: (2024)

One-Step Diffusion Distillation via Deep Equilibrium Models
by: Geng, Zhengyang, et al.
Published: (2023)

Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers
by: Williams, Joshua Nathaniel, et al.
Published: (2024)

Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation
by: Roderick, Melrose, et al.
Published: (2023)

Existing Large Language Model Unlearning Evaluations Are Inconclusive
by: Feng, Zhili, et al.
Published: (2025)

Context-Parametric Inversion: Why Instruction Finetuning Can Worsen Context Reliance
by: Goyal, Sachin, et al.
Published: (2024)

Massive Activations in Large Language Models
by: Sun, Mingjie, et al.
Published: (2024)

The Mixing method: low-rank coordinate descent for semidefinite programming with diagonal constraints
by: Wang, Po-Wei, et al.
Published: (2017)

Rethinking Distance Metrics for Counterfactual Explainability
by: Williams, Joshua Nathaniel, et al.
Published: (2024)

Safety Pretraining: Toward the Next Generation of Safe AI
by: Maini, Pratyush, et al.
Published: (2025)

A Simple and Effective Pruning Approach for Large Language Models
by: Sun, Mingjie, et al.
Published: (2023)

Understanding Optimization in Deep Learning with Central Flows
by: Cohen, Jeremy M., et al.
Published: (2024)

Understanding Hallucinations in Diffusion Models through Mode Interpolation
by: Aithal, Sumukh K, et al.
Published: (2024)

Evaluating Language Model Reasoning about Confidential Information
by: Sam, Dylan, et al.
Published: (2025)

When Should We Introduce Safety Interventions During Pretraining?
by: Sam, Dylan, et al.
Published: (2026)

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
by: Zhai, Runtian, et al.
Published: (2023)

Finetuning CLIP to Reason about Pairwise Differences
by: Sam, Dylan, et al.
Published: (2024)