| Main Authors: | Pang, Richard Yuanzhe, Roller, Stephen, Cho, Kyunghyun, He, He, Weston, Jason |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2307.14117 |
Similar Items
Iterative Reasoning Preference Optimization
by: Pang, Richard Yuanzhe, et al.
Published: (2024)
System-Level Natural Language Feedback
by: Yuan, Weizhe, et al.
Published: (2023)
Self-Rewarding Language Models
by: Yuan, Weizhe, et al.
Published: (2024)
Show Your Work with Confidence: Confidence Bands for Tuning Curves
by: Lourie, Nicholas, et al.
Published: (2023)
Hyperparameter Loss Surfaces Are Simple Near their Optima
by: Lourie, Nicholas, et al.
Published: (2025)
Following Length Constraints in Instructions
by: Yuan, Weizhe, et al.
Published: (2024)
Self-Consistency Preference Optimization
by: Prasad, Archiki, et al.
Published: (2024)
Self-Taught Evaluators
by: Wang, Tianlu, et al.
Published: (2024)
Generalization Measures for Zero-Shot Cross-Lingual Transfer
by: Bassi, Saksham, et al.
Published: (2024)
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
by: Yuan, Weizhe, et al.
Published: (2025)
Multi-modal Data Spectrum: Multi-modal Datasets are Multi-dimensional
by: Madaan, Divyam, et al.
Published: (2025)
Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded Dialogues
by: Petrak, Dominic, et al.
Published: (2024)
Efficient semantic uncertainty quantification in language models via diversity-steered sampling
by: Park, Ji Won, et al.
Published: (2025)
Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling
by: Madaan, Divyam, et al.
Published: (2026)
Temporal Generalization: A Reality Check
by: Madaan, Divyam, et al.
Published: (2025)
Training Language Models with Language Feedback at Scale
by: Scheurer, Jérémy, et al.
Published: (2023)
IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data
by: Peng, Bo, et al.
Published: (2025)
First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models
by: Saphra, Naomi, et al.
Published: (2023)
Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
by: Lourie, Nicholas, et al.
Published: (2025)
On the Relationship Between the Choice of Representation and In-Context Learning
by: Marinescu, Ioana, et al.
Published: (2025)
Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback
by: Lee, Dong Won, et al.
Published: (2024)
An Overview of Large Language Models for Statisticians
by: Ji, Wenlong, et al.
Published: (2025)
MentalAgora: A Gateway to Advanced Personalized Care in Mental Health through Multi-Agent Debating and Attribute Control
by: Lee, Yeonji, et al.
Published: (2024)
Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs
by: Chen, Angelica, et al.
Published: (2023)
Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression
by: Yoshida, Kai, et al.
Published: (2025)
Language Models as Causal Effect Generators
by: Bynum, Lucius E. J., et al.
Published: (2024)
Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models
by: Yoo, Haneul, et al.
Published: (2025)
Leveraging LLMs for Dialogue Quality Measurement
by: Jia, Jinghan, et al.
Published: (2024)
Transformers Struggle to Learn to Search
by: Saparov, Abulhair, et al.
Published: (2024)
UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking
by: Li, Chuang, et al.
Published: (2023)
Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time
by: Hu, Michael Y., et al.
Published: (2026)
Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning
by: Madaan, Divyam, et al.
Published: (2024)
Shared Heritage, Distinct Writing: Rethinking Resource Selection for East Asian Historical Documents
by: Song, Seyoung, et al.
Published: (2024)
HERITAGE: An End-to-End Web Platform for Processing Korean Historical Documents in Hanja
by: Song, Seyoung, et al.
Published: (2025)
The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models
by: Kirsanov, Artem, et al.
Published: (2025)
AERIC: Anticipatory Hidden-State Monitoring for Implicit Harmful Dialogue
by: Park, Jihyung, et al.
Published: (2026)
Aioli: A Unified Optimization Framework for Language Model Data Mixing
by: Chen, Mayee F., et al.
Published: (2024)
Multi-Token Attention
by: Golovneva, Olga, et al.
Published: (2025)
BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation
by: He, Yuhong, et al.
Published: (2024)
Semi-Supervised Dialogue Abstractive Summarization via High-Quality Pseudolabel Selection
by: He, Jianfeng, et al.
Published: (2024)