| Main Authors: | Shaib, Chantal, Govindarajan, Venkata S., Barrow, Joe, Sun, Jiuding, Siu, Alexa F., Wallace, Byron C., Nenkova, Ani |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.00553 |
Similar Items
How Much Annotation is Needed to Compare Summarization Models?
by: Shaib, Chantal, et al.
Published: (2024)
Measuring Lexical Diversity of Synthetic Data Generated through Fine-Grained Persona Prompting
by: Kambhatla, Gauri, et al.
Published: (2025)
Measuring AI "Slop" in Text
by: Shaib, Chantal, et al.
Published: (2025)
Detection and Measurement of Syntactic Templates in Generated Text
by: Shaib, Chantal, et al.
Published: (2024)
Who Taught You That? Tracing Teachers in Model Distillation
by: Wadhwa, Somin, et al.
Published: (2025)
Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
by: Shaib, Chantal, et al.
Published: (2025)
SQLSpace: A Representation Space for Text-to-SQL to Discover and Mitigate Robustness Gaps
by: Srikanth, Neha, et al.
Published: (2025)
Faithfulness vs. Safety: Evaluating LLM Behavior Under Counterfactual Medical Evidence
by: Mo, Kaijie, et al.
Published: (2026)
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State
by: Pal, Koyena, et al.
Published: (2023)
Dark & Stormy: Modeling Humor in Sentences from the Bulwer-Lytton Fiction Contest
by: Govindarajan, Venkata S, et al.
Published: (2025)
Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
by: Ramprasad, Sanjana, et al.
Published: (2024)
Open (Clinical) LLMs are Sensitive to Instruction Phrasings
by: Arroyo, Alberto Mario Ceballos, et al.
Published: (2024)
Compared to What? Baselines and Metrics for Counterfactual Prompting
by: Yang, Zihao, et al.
Published: (2026)
Can SAEs reveal and mitigate racial biases of LLMs in healthcare?
by: Ahsan, Hiba, et al.
Published: (2025)
Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning
by: Xie, Kaige, et al.
Published: (2023)
Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways
by: Govindarajan, Venkata S, et al.
Published: (2023)
Decomposing Generalization: Models of Generic, Habitual, and Episodic Statements
by: Govindarajan, Venkata Subrahmanyan, et al.
Published: (2019)
Measuring the inhomogeneous obscuration of AGN with mid-infrared observations
by: M. Nenkova
Published: (2007)
Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study
by: Sun, Zhaoyue, et al.
Published: (2024)
Circuit Distillation
by: Wadhwa, Somin, et al.
Published: (2025)
Revisiting Relation Extraction in the era of Large Language Models
by: Wadhwa, Somin, et al.
Published: (2023)
Investigating Mysteries of CoT-Augmented Distillation
by: Wadhwa, Somin, et al.
Published: (2024)
Augmenting Rating-Scale Measures with Text-Derived Items Using the Information-Determined Scoring (IDS) Framework
by: Watson, Joe, et al.
Published: (2025)
Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias
by: Govindarajan, Venkata S, et al.
Published: (2023)
VickreyFeedback: Cost-efficient Data Construction for Reinforcement Learning from Human Feedback
by: Zhang, Guoxi, et al.
Published: (2024)
Do they mean 'us'? Interpreting Referring Expressions in Intergroup Bias
by: Govindarajan, Venkata S, et al.
Published: (2024)
CommonForms: A Large, Diverse Dataset for Form Field Detection
by: Barrow, Joe
Published: (2025)
Vector Arithmetic in Concept and Token Subspaces
by: Feucht, Sheridan, et al.
Published: (2025)
Help! Need Advice on Identifying Advice
by: Govindarajan, Venkata Subrahmanyan, et al.
Published: (2020)
Multimodal QUD: Inquisitive Questions from Scientific Figures
by: Wu, Yating, et al.
Published: (2026)
Chain of Logic: Rule-Based Reasoning with Large Language Models
by: Servantez, Sergio, et al.
Published: (2024)
Automatically Extracting Numerical Results from Randomized Controlled Trials with Large Language Models
by: Yun, Hye Sun, et al.
Published: (2024)
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification
by: Trienes, Jan, et al.
Published: (2024)
Don't Pay Attention, PLANT It: Pretraining Attention via Learning-to-Rank
by: Roy, Debjyoti Saha, et al.
Published: (2024)
How people talk about each other: Modeling Generalized Intergroup Bias and Emotion
by: Govindarajan, Venkata S, et al.
Published: (2022)
Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains
by: Ramprasad, Sanjana, et al.
Published: (2024)
Learning from Natural Language Explanations for Generalizable Entity Matching
by: Wadhwa, Somin, et al.
Published: (2024)
Do Multi-Document Summarization Models Synthesize?
by: DeYoung, Jay, et al.
Published: (2023)
ttta: Tools for Temporal Text Analysis
by: Lange, Kai-Robin, et al.
Published: (2025)
SafePassage: High-Fidelity Information Extraction with Black Box LLMs
by: Barrow, Joe, et al.
Published: (2025)