| Main Authors: | Yu, Cheng, Stroebl, Benedikt, Yang, Diyi, Papakyriakopoulos, Orestis |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.14215 |
Similar Items
How Should AI Safety Benchmarks Benchmark Safety?
by: Yu, Cheng, et al.
Published: (2026)
Position: Measure Dataset Diversity, Don't Just Claim It
by: Zhao, Dora, et al.
Published: (2024)
Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators
by: Hutiri, Wiebke, et al.
Published: (2024)
Engaged AI Governance: Addressing the Last Mile Challenge Through Internal Expert Collaboration
by: Jarvers, Simon, et al.
Published: (2026)
AI Adoption Across Mission-Driven Organizations
by: Ali, Dalia, et al.
Published: (2025)
Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce
by: Shao, Yijia, et al.
Published: (2025)
Operationalizing Pluralistic Values in Large Language Model Alignment Reveals Trade-offs in Safety, Inclusivity, and Model Behavior
by: Ali, Dalia, et al.
Published: (2025)
SWE-chat: Coding Agent Interactions From Real Users in the Wild
by: Baumann, Joachim, et al.
Published: (2026)
SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery
by: Anugraha, David, et al.
Published: (2026)
Unintended Impacts of LLM Alignment on Global Representation
by: Ryan, Michael J., et al.
Published: (2024)
Mapping the Spiral of Silence: Surveying Unspoken Opinions in Online Communities
by: Zhao, Dora, et al.
Published: (2025)
Sycophantic AI makes human interaction feel more effortful and less satisfying over time
by: Ibrahim, Lujain, et al.
Published: (2026)
AI Agents That Matter
by: Kapoor, Sayash, et al.
Published: (2024)
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
by: Si, Chenglei, et al.
Published: (2025)
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
by: Si, Chenglei, et al.
Published: (2024)
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
by: Li, Minzhi, et al.
Published: (2024)
Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents
by: Li, Miles Q., et al.
Published: (2026)
Meet Your New Client: Writing Reports for AI -- Benchmarking Information Loss in Market Research Deliverables
by: Simmering, Paul F., et al.
Published: (2025)
The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems
by: Staufer, Leon, et al.
Published: (2026)
Auditing Agent Harness Safety
by: Liu, Chengzhi, et al.
Published: (2026)
Offloading Score: Measuring AI Reliance Through Counterfactual Workflows
by: Padmakumar, Vishakh, et al.
Published: (2026)
The High Cost of Incivility: Quantifying Interaction Inefficiency via Multi-Agent Monte Carlo Simulations
by: Mangold, Benedikt
Published: (2025)
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
by: Siegel, Zachary S., et al.
Published: (2024)
No AI After Auschwitz? Bridging AI and Memory Ethics in the Context of Information Retrieval of Genocide-Related Information
by: Makhortykh, Mykola
Published: (2024)
Questionnaire Responses Do not Capture the Safety of AI Agents
by: Hellrigel-Holderbaum, Max, et al.
Published: (2026)
AI Safety for Everyone
by: Gyevnar, Balint, et al.
Published: (2025)
AI Safety in Generative AI Large Language Models: A Survey
by: Chua, Jaymari, et al.
Published: (2024)
Auditing Gender Presentation Differences in Text-to-Image Models
by: Zhang, Yanzhe, et al.
Published: (2023)
The Role of AI Safety Institutes in Contributing to International Standards for Frontier AI Safety
by: Fort, Kristina
Published: (2024)
Safety First: Psychological Safety as the Key to AI Transformation
by: Reich, Aaron, et al.
Published: (2026)
AI Safety, Alignment, and Ethics (AI SAE)
by: Waldner, Dylan
Published: (2025)
Safety cases for frontier AI
by: Buhl, Marie Davidsen, et al.
Published: (2024)
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
by: Qiu, Jiahao, et al.
Published: (2025)
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering
by: Si, Chenglei, et al.
Published: (2024)
Verbalizing LLMs' assumptions to explain and control sycophancy
by: Cheng, Myra, et al.
Published: (2026)
Opportunities and Risks of Generative AI through the Health Information Journey
by: DeVerna, Matthew R., et al.
Published: (2026)
Whose Knowledge Counts? Co-Designing Community-Centered AI Auditing Tools with Educators in Hawai`i
by: Zhao, Dora, et al.
Published: (2026)
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
by: Li, Jing-Jing, et al.
Published: (2024)
The BIG Argument for AI Safety Cases
by: Habli, Ibrahim, et al.
Published: (2025)
Persuasion and Safety in the Era of Generative AI
by: Kong, Haein
Published: (2025)