| Main Authors: | Li, Xiang Lisa, Kaiyom, Farzaan, Liu, Evan Zheran, Mai, Yifan, Liang, Percy, Hashimoto, Tatsunori |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.08351 |
Similar Items
On the Learnability of Watermarks for Language Models
by: Gu, Chenchen, et al.
Published: (2023)
Auditing Prompt Caching in Language Model APIs
by: Gu, Chenchen, et al.
Published: (2025)
Robust Distortion-free Watermarks for Language Models
by: Kuditipudi, Rohith, et al.
Published: (2023)
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
by: Dubois, Yann, et al.
Published: (2024)
Eliciting Language Model Behaviors with Investigator Agents
by: Li, Xiang Lisa, et al.
Published: (2025)
ArenaBencher: Automatic Benchmark Evolution via Multi-Model Competitive Evaluation
by: Liu, Qin, et al.
Published: (2025)
s1: Simple test-time scaling
by: Muennighoff, Niklas, et al.
Published: (2025)
Language Models with Conformal Factuality Guarantees
by: Mohri, Christopher, et al.
Published: (2024)
Understanding Finetuning for Factual Knowledge Extraction
by: Ghosal, Gaurav, et al.
Published: (2024)
Improving Pretraining Data Using Perplexity Correlations
by: Thrush, Tristan, et al.
Published: (2024)
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
by: Dubois, Yann, et al.
Published: (2023)
Evaluating Self-Supervised Learning via Risk Decomposition
by: Dubois, Yann, et al.
Published: (2023)
Linguistic Calibration of Long-Form Generations
by: Band, Neil, et al.
Published: (2024)
Observational Scaling Laws and the Predictability of Language Model Performance
by: Ruan, Yangjun, et al.
Published: (2024)
Towards Execution-Grounded Automated AI Research
by: Si, Chenglei, et al.
Published: (2026)
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
by: Si, Chenglei, et al.
Published: (2024)
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
by: Si, Chenglei, et al.
Published: (2025)
Putting It All into Context: Simplifying Agents with LCLMs
by: Jiang, Mingjian, et al.
Published: (2025)
Synthetic continued pretraining
by: Yang, Zitong, et al.
Published: (2024)
Reasoning to Learn from Latent Thoughts
by: Ruan, Yangjun, et al.
Published: (2025)
Pre-training under infinite compute
by: Kim, Konwoo, et al.
Published: (2025)
Agentic Adversarial QA for Improving Domain-Specific LLMs
by: Grari, Vincent, et al.
Published: (2026)
Graph-based Uncertainty Metrics for Long-form Language Model Outputs
by: Jiang, Mingjian, et al.
Published: (2024)
Out-of-Domain Robustness via Targeted Augmentations
by: Gao, Irena, et al.
Published: (2023)
Replaying pre-training data improves fine-tuning
by: Kotha, Suhas, et al.
Published: (2026)
Self-Verified Distillation: Your Language Model Is Secretly Its Own Synthetic Data Pipeline
by: Lee, Tony, et al.
Published: (2026)
AutoRAGTuner: A Declarative Framework for Automatic Optimization of RAG Pipelines
by: Zeng, Xintan, et al.
Published: (2026)
Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization
by: Papenmeier, Leonard, et al.
Published: (2025)
Independence Tests for Language Models
by: Zhu, Sally, et al.
Published: (2025)
CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition
by: Bartelds, Martijn, et al.
Published: (2025)
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
by: Liu, Hong, et al.
Published: (2023)
On the Entropy Calibration of Language Models
by: Cao, Steven, et al.
Published: (2025)
CEQuest: Benchmarking Large Language Models for Construction Estimation
by: Wu, Yanzhao, et al.
Published: (2025)
Synthetic Data for any Differentiable Target
by: Thrush, Tristan, et al.
Published: (2026)
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective
by: Wen, Kaiyue, et al.
Published: (2024)
Temporal Entailment Pretraining for Clinical Language Models over EHR Data
by: Tanaka, Tatsunori, et al.
Published: (2025)
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
by: Gong, Zixuan, et al.
Published: (2025)
Data-efficient pre-training by scaling synthetic megadocs
by: Kim, Konwoo, et al.
Published: (2026)
FSPO: Few-Shot Optimization of Synthetic Preferences Personalizes to Real Users
by: Singh, Anikait, et al.
Published: (2025)
Knowledge Graph Construction in Power Distribution Networks
by: Li, Xiang, et al.
Published: (2023)