| Main Authors: | Ryser, Adrian, Allwein, Florian, Schlippe, Tim |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.09088 |
Similar Items
RubiSCoT: A Framework for AI-Supported Academic Assessment
by: Fröhlich, Thorsten, et al.
Published: (2025)
A Cross-Cultural Assessment of Human Ability to Detect LLM-Generated Fake News about South Africa
by: Schlippe, Tim, et al.
Published: (2025)
A User-Centric Analysis of Explainability in AI-Based Medical Image Diagnosis
by: Wagner, Julia, et al.
Published: (2026)
Exploring ChatGPT's Empathic Abilities
by: Schaaff, Kristina, et al.
Published: (2023)
Classification of Human- and AI-Generated Texts for English, French, German, and Spanish
by: Schaaff, Kristina, et al.
Published: (2023)
Language-Independent Sentiment Labelling with Distant Supervision: A Case Study for English, Sepedi and Setswana
by: Mabokela, Koena Ronny, et al.
Published: (2025)
Large Language Models for Sentiment Analysis to Detect Social Challenges: A Use Case with South African Languages
by: Mabokela, Koena Ronny, et al.
Published: (2025)
Evaluating Retrieval-Augmented Generation Variants for Natural Language-Based SQL and API Call Generation
by: Marketsmüller, Michael, et al.
Published: (2026)
Deep Learning-Based Anomaly Detection in Spacecraft Telemetry on Edge Devices
by: Goetze, Christopher, et al.
Published: (2026)
Assessing Consciousness-Related Behaviors in Large Language Models Using the Maze Test
by: Pimenta, Rui A., et al.
Published: (2025)
Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning
by: Wu, Jiayun, et al.
Published: (2025)
Benchmarking NLP-supported Language Sample Analysis for Swiss Children's Speech
by: Ryser, Anja, et al.
Published: (2025)
Exploring Trust Calibration in XAI - The Impact of Exposing Model Limitations to Lay Users
by: Ventura, Alfio, et al.
Published: (2026)
Calibrated Language Models Must Hallucinate
by: Kalai, Adam Tauman, et al.
Published: (2023)
Citations and Trust in LLM Generated Responses
by: Ding, Yifan, et al.
Published: (2025)
Uncertainty Awareness and Trust in Explainable AI- On Trust Calibration using Local and Global Explanations
by: Newen, Carina, et al.
Published: (2025)
Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents
by: Karge, Jonas
Published: (2026)
Mitigating LLM Hallucinations with Knowledge Graphs: A Case Study
by: Li, Harry, et al.
Published: (2025)
Two Is Better Than One: Aligned Representation Pairs for Anomaly Detection
by: Ryser, Alain, et al.
Published: (2024)
To Trust or Not to Trust: On Calibration in ML-based Resource Allocation for Wireless Networks
by: Raina, Rashika, et al.
Published: (2025)
Hallucination in LLM-Based Code Generation: An Automotive Case Study
by: Pavel, Marc, et al.
Published: (2025)
Can You Trust an LLM with Your Life-Changing Decision? An Investigation into AI High-Stakes Responses
by: Cahyono, Joshua Adrian, et al.
Published: (2025)
Cross-Modal Attention Calibration for LVLM Hallucination Mitigation
by: Li, Jiaming, et al.
Published: (2025)
Evaluating Human Trust in LLM-Based Planners: A Preliminary Study
by: Chen, Shenghui, et al.
Published: (2025)
Hallucination Basins: A Dynamic Framework for Understanding and Controlling LLM Hallucinations
by: Cherukuri, Kalyan, et al.
Published: (2026)
Algorithmically Establishing Trust in Evaluators
by: de Wynter, Adrian
Published: (2025)
Dynamic Trust Calibration Using Contextual Bandits
by: Henrique, Bruno M., et al.
Published: (2025)
TERMS-Bench: Diagnosing LLM Negotiation Agents Beyond Deal Rate
by: Zhang, Erica, et al.
Published: (2026)
HalluLens: LLM Hallucination Benchmark
by: Bang, Yejin, et al.
Published: (2025)
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
by: Luo, Jinyuan, et al.
Published: (2025)
Dealing with Inconsistency for Reasoning over Knowledge Graphs: A Survey
by: Nentidis, Anastasios, et al.
Published: (2025)
Learn to Code Sustainably: An Empirical Study on LLM-based Green Code Generation
by: Vartziotis, Tina, et al.
Published: (2024)
HALT-RAG: A Task-Adaptable Framework for Hallucination Detection with Calibrated NLI Ensembles and Abstention
by: Goswami, Saumya, et al.
Published: (2025)
Banishing LLM Hallucinations Requires Rethinking Generalization
by: Li, Johnny, et al.
Published: (2024)
MIRAGE-Bench: LLM Agent is Hallucinating and Where to Find Them
by: Zhang, Weichen, et al.
Published: (2025)
Can We Trust LLM Detectors?
by: Sandhan, Jivnesh, et al.
Published: (2026)
CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
by: Huang, Xiaoyi, et al.
Published: (2026)
How Much Do LLMs Hallucinate across Languages? On Realistic Multilingual Estimation of LLM Hallucination
by: Islam, Saad Obaid ul, et al.
Published: (2025)
Dealing with Uncertainty in Contextual Anomaly Detection
by: Bindini, Luca, et al.
Published: (2025)
LLM-REVal: Can We Trust LLM Reviewers Yet?
by: Li, Rui, et al.
Published: (2025)