:: Library Catalog

Bibliographic Details
Main Authors:	Rao, Abhinav, Yerukola, Akhila, Shah, Vishwa, Reinecke, Katharina, Sap, Maarten
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2404.12464

Similar Items

Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
by: Yerukola, Akhila, et al.
Published: (2025)

Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
by: Yerukola, Akhila, et al.
Published: (2024)

Out of Style: RAG's Fragility to Linguistic Variation
by: Cao, Tianyu, et al.
Published: (2025)

PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
by: Kumar, Priyanshu, et al.
Published: (2025)

Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication
by: Shen, Jocelyn, et al.
Published: (2025)

Social World Models
by: Zhou, Xuhui, et al.
Published: (2025)

Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering
by: Vijayvargiya, Sanidhya, et al.
Published: (2025)

Data Defenses Against Large Language Models
by: Agnew, William, et al.
Published: (2024)

Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks
by: Gao, Alice, et al.
Published: (2026)

Pre-Calc: Learning to Use the Calculator Improves Numeracy in Language Models
by: Veerendranath, Vishruth, et al.
Published: (2024)

From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models
by: Mendelsohn, Julia, et al.
Published: (2023)

PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models
by: Jain, Devansh, et al.
Published: (2024)

Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty
by: Zhou, Kaitlyn, et al.
Published: (2024)

Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas
by: Kwok, Louis, et al.
Published: (2024)

Measuring Social Norms of Large Language Models
by: Yuan, Ye, et al.
Published: (2024)

EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
by: Ghate, Kshitish, et al.
Published: (2025)

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
by: Baheti, Ashutosh, et al.
Published: (2023)

SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
by: Vijjini, Anvesh Rao, et al.
Published: (2024)

Adaptable Logical Control for Large Language Models
by: Zhang, Honghua, et al.
Published: (2024)

Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance
by: Zhou, Kaitlyn, et al.
Published: (2024)

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
by: Jiang, Liwei, et al.
Published: (2024)

Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs
by: Basoah, Jeffrey, et al.
Published: (2025)

Belief Revision: The Adaptability of Large Language Models Reasoning
by: Wilie, Bryan, et al.
Published: (2024)

Mitigating Bias in RAG: Controlling the Embedder
by: Kim, Taeyoun, et al.
Published: (2025)

Graph-Assisted Culturally Adaptable Idiomatic Translation for Indic Languages
by: Singh, Pratik Rakesh, et al.
Published: (2025)

Adaptable and Reliable Text Classification using Large Language Models
by: Wang, Zhiqiang, et al.
Published: (2024)

Stereotype or Personalization? User Identity Biases Chatbot Recommendations
by: Kantharuban, Anjali, et al.
Published: (2024)

Where Do People Tell Stories Online? Story Detection Across Online Communities
by: Antoniak, Maria, et al.
Published: (2023)

CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models
by: Wang, Yuhang, et al.
Published: (2023)

Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
by: He, Zhonghao, et al.
Published: (2025)

Rejected Dialects: Biases Against African American Language in Reward Models
by: Mire, Joel, et al.
Published: (2025)

Building and Measuring Trust between Large Language Models
by: Buyl, Maarten, et al.
Published: (2025)

SA-MDKIF: A Scalable and Adaptable Medical Domain Knowledge Injection Framework for Large Language Models
by: Xu, Tianhan, et al.
Published: (2024)

VideoNorms: Benchmarking Cultural Awareness of Video Language Models
by: Varimalla, Nikhil Reddy, et al.
Published: (2025)

HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs
by: Shen, Jocelyn, et al.
Published: (2024)

Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
by: Jiang, Liwei, et al.
Published: (2025)

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
by: Mireshghallah, Niloofar, et al.
Published: (2023)

SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents
by: Wang, Ruiyi, et al.
Published: (2024)

Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
by: Zhou, Xuhui, et al.
Published: (2024)

Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification
by: Beniwal, Himanshu, et al.
Published: (2025)