| Main Author: | Sandrini, Peter |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.23399 |
Similar Items
Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs
by: Colombo, Emilio, et al.
Published: (2024)
PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives
by: Zhang, Zhaowei, et al.
Published: (2025)
Self-hosted Lecture-to-Quiz: Local LLM MCQ Generation with Deterministic Quality Control
by: Shintani, Seine A.
Published: (2026)
Aspect-oriented Consumer Health Answer Summarization
by: Chaturvedi, Rochana, et al.
Published: (2024)
People Are Highly Cooperative with Large Language Models, Especially When Communication Is Possible or Following Human Interaction
by: Niszczota, Paweł, et al.
Published: (2025)
Leveraging Artificial Intelligence as a Strategic Growth Catalyst for Small and Medium-sized Enterprises
by: Agbaakin, Oluwatosin
Published: (2025)
Chatbot Deployment Considerations for Application-Agnostic Human-Machine Dialogues
by: Rivas, Pablo, et al.
Published: (2025)
Superhuman Game AI Disclosure: Expertise and Context Moderate Effects on Trust and Fairness
by: Chua, Jaymari, et al.
Published: (2025)
Textual Entailment is not a Better Bias Metric than Token Probability
by: Felkner, Virginia K., et al.
Published: (2025)
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction
by: Felkner, Virginia K., et al.
Published: (2024)
ChatGPT and Gemini participated in the Korean College Scholastic Ability Test -- Earth Science I
by: Ga, Seok-Hyun, et al.
Published: (2025)
Benchmarking Bengali Dialectal Bias: A Multi-Stage Framework Integrating RAG-Based Translation and Human-Augmented RLAIF
by: Sami, K. M. Jubair, et al.
Published: (2026)
How Large Language Models Are Changing MOOC Essay Answers: A Comparison of Pre- and Post-LLM Responses
by: Leppänen, Leo, et al.
Published: (2025)
Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation
by: Butts, Gavin, et al.
Published: (2024)
Leveraging Multi-Source Textural UGC for Neighbourhood Housing Quality Assessment: A GPT-Enhanced Framework
by: Hong, Qiyuan, et al.
Published: (2025)
WSC+: Enhancing The Winograd Schema Challenge Using Tree-of-Experts
by: Zahraei, Pardis Sadat, et al.
Published: (2024)
How Utilitarian Are OpenAI's Models Really? Replicating and Reinterpreting Pfeffer, Krügel, and Uhl (2025)
by: Himmelreich, Johannes
Published: (2026)
On Fact and Frequency: LLM Responses to Misinformation Expressed with Uncertainty
by: van de Sande, Yana, et al.
Published: (2025)
Big Help or Big Brother? Auditing Tracking, Profiling, and Personalization in Generative AI Assistants
by: Vekaria, Yash, et al.
Published: (2025)
The Invisible Coalition Partner: How LLMs Vote When Democracy Gets Concrete
by: Barmettler, Joel
Published: (2026)
Implicit Geographic Inference in LLM Medical Triage: Language-Driven Disparities in Emergency Recommendations
by: Wong, Qi Han
Published: (2026)
Self-Anchored Attention Model for Sample-Efficient Classification of Prosocial Text Chat
by: Li, Zhuofang, et al.
Published: (2025)
Prosocial Behavior Detection in Player Game Chat: From Aligning Human-AI Definitions to Efficient Annotation at Scale
by: Kocielnik, Rafal, et al.
Published: (2025)
AI vs. Human Moderators: A Comparative Evaluation of Multimodal LLMs in Content Moderation for Brand Safety
by: Levi, Adi, et al.
Published: (2025)
Growing a Tail: Increasing Output Diversity in Large Language Models
by: Shur-Ofry, Michal, et al.
Published: (2024)
Rejected Dialects: Biases Against African American Language in Reward Models
by: Mire, Joel, et al.
Published: (2025)
Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback
by: Du, Yishan, et al.
Published: (2025)
Industrialized Deception: The Collateral Effects of LLM-Generated Misinformation on Digital Ecosystems
by: Loth, Alexander, et al.
Published: (2026)
The Democratic Paradox in Large Language Models' Underestimation of Press Freedom
by: Loaiza, I., et al.
Published: (2025)
The Epistemic Suite: A Post-Foundational Diagnostic Methodology for Assessing AI Knowledge Claims
by: Kelly, Matthew
Published: (2025)
Understanding Gen Alpha Digital Language: Evaluation of LLM Safety Systems for Content Moderation
by: Mehta, Manisha, et al.
Published: (2025)
Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation
by: Stewart, Ian, et al.
Published: (2024)
Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum
by: Hoheisel, Raphael, et al.
Published: (2026)
PromptAug: Fine-grained Conflict Classification Using Data Augmentation
by: Warke, Oliver, et al.
Published: (2025)
Large language models can replicate cross-cultural differences in personality
by: Niszczota, Paweł, et al.
Published: (2023)
Grandes modelos de lenguaje: de la predicción de palabras a la comprensión?
by: Gómez-Rodríguez, Carlos
Published: (2025)
Killer Apps: Low-Speed, Large-Scale AI Weapons
by: Feldman, Philip, et al.
Published: (2024)
Efficacy of a Computer Tutor that Models Expert Human Tutors
by: Olney, Andrew M., et al.
Published: (2025)
Bye Bye Perspective API: Lessons for Measurement Infrastructure in NLP, CSS and LLM Evaluation
by: Hartmann, David, et al.
Published: (2026)
ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech
by: Koniaris, Marios, et al.
Published: (2025)