| Main Authors: | Akpinar, Nil-Jana, Avula, Sandeep, Lee, CJ, Dang, Brandon, Razat, Kaza, Murdock, Vanessa |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.15556 |
Similar Items
Who's Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering
by: Akpinar, Nil-Jana, et al.
Published: (2025)
A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text
by: Sagae, Alicia, et al.
Published: (2025)
Authenticity and exclusion: social media algorithms and the dynamics of belonging in epistemic communities
by: Akpinar, Nil-Jana, et al.
Published: (2024)
The Impact of Differential Feature Under-reporting on Algorithmic Fairness
by: Akpinar, Nil-Jana, et al.
Published: (2024)
Precise Model Benchmarking with Only a Few Observations
by: Fogliato, Riccardo, et al.
Published: (2024)
When Neutral Summaries are not that Neutral: Quantifying Political Neutrality in LLM-Generated News Summaries
by: Vijay, Supriti, et al.
Published: (2024)
Misalignment of LLM-Generated Personas with Human Perceptions in Low-Resource Settings
by: Prama, Tabia Tanzin, et al.
Published: (2025)
Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences
by: Ashkinaze, Joshua, et al.
Published: (2025)
When Can We Trust LLM Graders? Calibrating Confidence for Automated Assessment
by: Ferrer, Robinson, et al.
Published: (2026)
Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability
by: Saju, Lorraine, et al.
Published: (2025)
Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness
by: Alipour, Shayan, et al.
Published: (2024)
Exploring the Human-LLM Synergy in Advancing Theory-driven Qualitative Analysis
by: Meng, Han, et al.
Published: (2024)
Redefining Research Crowdsourcing: Incorporating Human Feedback with LLM-Powered Digital Twins
by: Chan, Amanda, et al.
Published: (2025)
Bias in the Tails: How Name-conditioned Evaluative Framing in Resume Summaries Destabilizes LLM-based Hiring
by: Nghiem, Huy, et al.
Published: (2026)
The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks
by: Møller, Anders Giovanni, et al.
Published: (2023)
Reviewing the Reviewer: Elevating Peer Review Quality through LLM-Guided Feedback
by: Purkayastha, Sukannya, et al.
Published: (2026)
When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
by: Badawi, Abeer, et al.
Published: (2025)
Human or LLM as Standardized Patients? A Comparative Study for Medical Education
by: Zhang, Bingquan, et al.
Published: (2025)
Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries
by: Curran, Damian, et al.
Published: (2025)
Culturally Adaptive Explainable LLM Assessment for Multilingual Information Disorder: A Human-in-the-Loop Approach
by: Jouneghani, Maziar Kianimoghadam
Published: (2026)
Trust, Safety, and Accuracy: Assessing LLMs for Routine Maternity Advice
by: Divya, V Sai, et al.
Published: (2026)
Which Type of Students can LLMs Act? Investigating Authentic Simulation with Graph-based Human-AI Collaborative System
by: Li, Haoxuan, et al.
Published: (2025)
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States
by: Xiao, Yang, et al.
Published: (2025)
Whose Personae? Synthetic Persona Experiments in LLM Research and Pathways to Transparency
by: Batzner, Jan, et al.
Published: (2025)
"Would You Want an AI Tutor?" Understanding Stakeholder Perceptions of LLM-based Systems in the Classroom
by: Fuligni, Caterina, et al.
Published: (2025)
Words of Warmth: Trust and Sociability Norms for over 26k English Words
by: Mohammad, Saif M.
Published: (2025)
Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text
by: Najjar, Ayat, et al.
Published: (2025)
Dropouts in Confidence: Moral Uncertainty in Human-LLM Alignment
by: Kwon, Jea, et al.
Published: (2025)
Fair Representation in Parliamentary Summaries: Measuring and Mitigating Inclusion Bias
by: Cunningham, Eoghan, et al.
Published: (2025)
Human Preferences for Constructive Interactions in Language Model Alignment
by: Kyrychenko, Yara, et al.
Published: (2025)
Users Mispredict Their Own Preferences for AI Writing Assistance
by: Lai, Vivian, et al.
Published: (2026)
LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions
by: Liao, Zhehui, et al.
Published: (2024)
Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues
by: Scarlatos, Alexander, et al.
Published: (2025)
Generative AI in Higher Education: Seeing ChatGPT Through Universities' Policies, Resources, and Guidelines
by: Wang, Hui, et al.
Published: (2023)
Real or Robotic? Assessing Whether LLMs Accurately Simulate Qualities of Human Responses in Dialogue
by: Ivey, Jonathan, et al.
Published: (2024)
Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals
by: Ehara, Yo
Published: (2026)
Can Large Language Models Unlock Novel Scientific Research Ideas?
by: Kumar, Sandeep, et al.
Published: (2024)
Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings
by: Hong, Harbin, et al.
Published: (2025)
Unveiling Scoring Processes: Dissecting the Differences between LLMs and Human Graders in Automatic Scoring
by: Wu, Xuansheng, et al.
Published: (2024)
Building Trust: Foundations of Security, Safety and Transparency in AI
by: Sidhpurwala, Huzaifa, et al.
Published: (2024)