:: Library Catalog

Bibliographic Details
Main Authors:	Divya, V Sai; Bhanusree, A; Rimjhim; Rao, K Venkata Krishna
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Computers and Society
Online Access:	https://arxiv.org/abs/2603.16872

Similar Items

Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries
by: Bhanusree, Anagani, et al.
Published: (2026)

Recognition Without Authorization: LLMs and the Moral Order of Online Advice
by: van Nuenen, Tom
Published: (2026)

SafeMath: Inference-time Safety improves Math Accuracy
by: Basu, Sagnik, et al.
Published: (2026)

Building Trust: Foundations of Security, Safety and Transparency in AI
by: Sidhpurwala, Huzaifa, et al.
Published: (2024)

Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions
by: Patil, Parth, et al.
Published: (2026)

Ensuring Safety and Trust: Analyzing the Risks of Large Language Models in Medicine
by: Yang, Yifan, et al.
Published: (2024)

The Reliability of LLMs for Medical Diagnosis: An Examination of Consistency, Manipulation, and Contextual Awareness
by: Subedi, Krishna
Published: (2025)

Help! Need Advice on Identifying Advice
by: Govindarajan, Venkata Subrahmanyan, et al.
Published: (2020)

Assessing the Performance of Human-Capable LLMs -- Are LLMs Coming for Your Job?
by: Mavi, John, et al.
Published: (2024)

Assessing the Impact of Conspiracy Theories Using Large Language Models
by: Jiang, Bohan, et al.
Published: (2024)

When AI Speaks, Whose Values Does It Express? A Cross-Cultural Audit of Individualism-Collectivism Bias in Large Language Models
by: Venkata, Pruthvinath Jeripity
Published: (2026)

Leveraging LLMs to Assess Tutor Moves in Real-Life Dialogues: A Feasibility Study
by: Thomas, Danielle R., et al.
Published: (2025)

Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD
by: Tan, Bryan Chen Zhengyu, et al.
Published: (2025)

Are LLMs Court-Ready? Evaluating Frontier Models on Indian Legal Reasoning
by: Juvekar, Kush, et al.
Published: (2025)

Can LLMs Reason About Trust?: A Pilot Study
by: Debnath, Anushka, et al.
Published: (2025)

Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge
by: Cai, Yunna, et al.
Published: (2025)

Assessing LLMs in Art Contexts: Critique Generation and Theory of Mind Evaluation
by: Arita, Takaya, et al.
Published: (2025)

The Homogenization Problem in LLMs: Towards Meaningful Diversity in AI Safety
by: Rios-Sialer, Ian
Published: (2026)

When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
by: Badawi, Abeer, et al.
Published: (2025)

ZPD-SCA: Unveiling the Blind Spots of LLMs in Assessing Students' Cognitive Abilities
by: Dong, Wenhan, et al.
Published: (2025)

Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias
by: Govindarajan, Venkata S, et al.
Published: (2023)

Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs
by: Naderi, Nariman, et al.
Published: (2025)

When Do LLMs Generate Realistic Social Networks? A Multi-Dimensional Study of Culture, Language, Scale, and Method
by: Kilaru, Sai Hemanth, et al.
Published: (2026)

Real or Robotic? Assessing Whether LLMs Accurately Simulate Qualities of Human Responses in Dialogue
by: Ivey, Jonathan, et al.
Published: (2024)

LLM or Human? Perceptions of Trust and Information Quality in Research Summaries
by: Akpinar, Nil-Jana, et al.
Published: (2026)

Different Bias Under Different Criteria: Assessing Bias in LLMs with a Fact-Based Approach
by: Ko, Changgeon, et al.
Published: (2024)

SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
by: Li, Jing-Jing, et al.
Published: (2024)

The Statistical Signature of LLMs
by: Hadad, Ortal, et al.
Published: (2026)

LLMs left, right, and center: Assessing GPT's capabilities to label political bias from web domains
by: Hernandes, Raphael, et al.
Published: (2024)

CAIRNS: Balancing Readability and Scientific Accuracy in Climate Adaptation Question Answering
by: Kong, Liangji, et al.
Published: (2025)

Passing the Turing Test in Political Discourse: Fine-Tuning LLMs to Mimic Polarized Social Media Comments
by: Pazzaglia, et al.
Published: (2025)

When Can We Trust LLM Graders? Calibrating Confidence for Automated Assessment
by: Ferrer, Robinson, et al.
Published: (2026)

Words of Warmth: Trust and Sociability Norms for over 26k English Words
by: Mohammad, Saif M.
Published: (2025)

Expected Harm: Rethinking Safety Evaluation of (Mis)Aligned LLMs
by: Chen, Yen-Shan, et al.
Published: (2026)

Auditing Agent Harness Safety
by: Liu, Chengzhi, et al.
Published: (2026)

Unfair TOS: An Automated Approach using Customized BERT
by: Akash, Bathini Sai, et al.
Published: (2024)

Medical Malice: A Dataset for Context-Aware Safety in Healthcare LLMs
by: D'addario, Andrew Maranhão Ventura
Published: (2025)

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
by: Tedeschi, Simone, et al.
Published: (2024)

Paradox of De-identification: A Critique of HIPAA Safe Harbour in the Age of LLMs
by: Jiang, Lavender Y., et al.
Published: (2026)

EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI
by: Kasu, Sai Kartheek Reddy
Published: (2025)