| Main Authors: | Arnaiz-Rodriguez, Adrian, Baidal, Miguel, Derner, Erik, Annable, Jenn Layton, Ball, Mark, Ince, Mark, Vallejos, Elvira Perez, Oliver, Nuria |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.24857 |
Similar Items
Large Reasoning Models Are Autonomous Jailbreak Agents
by: Hagendorff, Thilo, et al.
Published: (2025)
Can ChatGPT Read Who You Are?
by: Derner, Erik, et al.
Published: (2023)
Mind the Style: Impact of Communication Style on Human-Chatbot Interaction
by: Derner, Erik, et al.
Published: (2026)
Towards Algorithmic Fairness by means of Instance-level Data Re-weighting based on Shapley Values
by: Arnaiz-Rodriguez, Adrian, et al.
Published: (2023)
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)
From "Help" to Helpful: A Hierarchical Assessment of LLMs in Mental e-Health Applications
by: Steigerwald, Philipp, et al.
Published: (2026)
Seeking Help, Facing Harm: Auditing TikTok's Mental Health Recommendations
by: Jamie, Pooriya, et al.
Published: (2026)
A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models
by: Derner, Erik, et al.
Published: (2023)
Towards Human-AI Complementarity in Matching Tasks
by: Arnaiz-Rodriguez, Adrian, et al.
Published: (2025)
Aesthetics as Structural Harm: Algorithmic Lookism Across Text-to-Image Generation and Classification
by: Doh, Miriam, et al.
Published: (2026)
Understanding Remote Mental Health Supporters' Help-Seeking in Online Communities
by: Lee, Tuan-He, et al.
Published: (2026)
From Risk Avoidance to User Empowerment in AI Mental Health Crisis Support
by: Kaveladze, Benjamin, et al.
Published: (2026)
Survival at Any Cost? LLMs and the Choice Between Self-Preservation and Human Harm
by: Mohamadi, Alireza, et al.
Published: (2025)
We Think, Therefore We Align LLMs to Helpful, Harmless and Honest Before They Go Wrong
by: Kashyap, Gautam Siddharth, et al.
Published: (2025)
Structural Group Unfairness: Measurement and Mitigation by means of the Effective Resistance
by: Arnaiz-Rodriguez, Adrian, et al.
Published: (2023)
AI Chatbots for Mental Health: Values and Harms from Lived Experiences of Depression
by: Yoo, Dong Whi, et al.
Published: (2025)
The Thin Line Between Comprehension and Persuasion in LLMs
by: de Wynter, Adrian, et al.
Published: (2025)
Seductive Details Behind Hyperlinks—Harmful or Helpful for Learning?
by: Bender, Lisa, et al.
Published: (2026)
Should LLM Safety Be More Than Refusing Harmful Instructions?
by: Maskey, Utsav, et al.
Published: (2025)
Considering Avatar Crossing as Harm or Help for Adolescents in Social VR
by: Bailey, Jakki O., et al.
Published: (2024)
Seeking Late Night Life Lines: Experiences of Conversational AI Use in Mental Health Crisis
by: Ajmani, Leah Hope, et al.
Published: (2025)
Telling Speculative Stories to Help Humans Imagine the Harms of Healthcare AI
by: Zhao, Xingmeng, et al.
Published: (2025)
Helping Johnny Make Sense of Privacy Policies with LLMs
by: Freiberger, Vincent, et al.
Published: (2025)
Leveraging LLMs for Translating and Classifying Mental Health Data
by: Skianis, Konstantinos, et al.
Published: (2024)
M-HELP: Using Social Media Data to Detect Mental Health Help-Seeking Signals
by: Sathvik, MSVPJ, et al.
Published: (2025)
Engagement-Optimized Care: When LLMs become Mental Health Infrastructure
by: Vecchione, Briana, et al.
Published: (2026)
Beyond Content Exposure: Systemic Factors Driving Moderators' Mental Health Crisis in Africa
by: Abdelkadir, Nuredin Ali, et al.
Published: (2026)
The Bias of Harmful Label Associations in Vision-Language Models
by: Hazirbas, Caner, et al.
Published: (2024)
From Harm to Help: Turning Reasoning In-Context Demos into Assets for Reasoning LMs
by: Wang, Haonan, et al.
Published: (2025)
Can Editing LLMs Inject Harm?
by: Chen, Canyu, et al.
Published: (2024)
LLMs Encode Harmfulness and Refusal Separately
by: Zhao, Jiachen, et al.
Published: (2025)
Recipes for Pre-training LLMs with MXFP8
by: Mishra, Asit, et al.
Published: (2025)
Leveraging Small LLMs for Argument Mining in Education: Argument Component Identification, Classification, and Assessment
by: Favero, Lucile, et al.
Published: (2025)
Understanding and Facilitating Mental Health Help-Seeking of Young Adults: A Socio-technical Ecosystem Framework
by: Liu, Jiaying, et al.
Published: (2024)
Too Helpful, Too Harmless, Too Honest or Just Right?
by: Kashyap, Gautam Siddharth, et al.
Published: (2025)
On the Sensitivity of Instruction-tuned LLMs to Harmful Sentences in Long Inputs
by: Ghorbanpour, Faeze, et al.
Published: (2025)
Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet
by: Atil, Berk, et al.
Published: (2025)
Large Language Models as Students Who Think Aloud: Overly Coherent, Verbose, and Confident
by: Borchers, Conrad, et al.
Published: (2026)
Automated Pest Counting in Water Traps through Active Robotic Stirring for Occlusion Handling
by: Gao, Xumin, et al.
Published: (2025)
The Disparate Benefits of Deep Ensembles
by: Schweighofer, Kajetan, et al.
Published: (2024)