:: Library Catalog

Bibliographic Details
Main Authors:	Loweimi, Erfan; Garcia, Sofia de la Fuente; Loveymi, Samira; Daneshvar, Hadi; Luz, Saturnino
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.09634

Similar Items

Predicting Psychological Well-Being from Spontaneous Speech using LLMs
by: Loweimi, Erfan, et al.
Published: (2026)

When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
by: Badawi, Abeer, et al.
Published: (2025)

An interpretable speech foundation model for depression detection by revealing prediction-relevant acoustic features from long speech
by: Deng, Qingkun, et al.
Published: (2024)

RCT: Random Consistency Training for Semi-supervised Sound Event Detection
by: Shao, Nian, et al.
Published: (2021)

Faithful Autoformalization via Roundtrip Verification and Repair
by: Amrollahi, Daneshvar, et al.
Published: (2026)

Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs
by: Loweimi, Erfan, et al.
Published: (2024)

Can We Trust LLMs? Mitigate Overconfidence Bias in LLMs through Knowledge Transfer
by: Yang, Haoyan, et al.
Published: (2024)

Evaluating GRPO and DPO for Faithful Chain-of-Thought Reasoning in LLMs
by: Mohammadi, Hadi, et al.
Published: (2025)

Can We Trust LLM Detectors?
by: Sandhan, Jivnesh, et al.
Published: (2026)

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?
by: Wang, Leyao, et al.
Published: (2026)

Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
by: Loweimi, Erfan, et al.
Published: (2025)

Can We Trust LLMs on Memristors? Diving into Reasoning Ability under Non-Ideality
by: Wu, Taiqiang, et al.
Published: (2026)

Zero-shot Audio Topic Reranking using Large Language Models
by: Qian, Mengjie, et al.
Published: (2023)

LLM-REVal: Can We Trust LLM Reviewers Yet?
by: Li, Rui, et al.
Published: (2025)

Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet
by: Atil, Berk, et al.
Published: (2025)

Can We Locate and Prevent Stereotypes in LLMs?
by: D'Souza, Alex
Published: (2026)

Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate
by: Kim, Kyungha, et al.
Published: (2024)

When Can We Trust LLM Graders? Calibrating Confidence for Automated Assessment
by: Ferrer, Robinson, et al.
Published: (2026)

Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?
by: He, Jianfeng, et al.
Published: (2024)

Probing Whisper for Dysarthric Speech in Detection and Assessment
by: Yue, Zhengjun, et al.
Published: (2025)

Faithful Summarization of Consumer Health Queries: A Cross-Lingual Framework with LLMs
by: Abrar, Ajwad, et al.
Published: (2025)

LLMs Can Plan Only If We Tell Them
by: Sel, Bilgehan, et al.
Published: (2025)

To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
by: Huang, Yukun, et al.
Published: (2024)

Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
by: Schimanski, Tobias, et al.
Published: (2024)

Bench-2-CoP: Can We Trust Benchmarking for EU AI Compliance?
by: Prandi, Matteo, et al.
Published: (2025)

Connected Speech-Based Cognitive Assessment in Chinese and English
by: Luz, Saturnino, et al.
Published: (2024)

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
by: Liu, Gabrielle Kaili-May, et al.
Published: (2025)

Leveraging LLMs for Translating and Classifying Mental Health Data
by: Skianis, Konstantinos, et al.
Published: (2024)

On the Robust Approximation of ASR Metrics
by: Waheed, Abdul, et al.
Published: (2025)

Can We Edit LLMs for Long-Tail Biomedical Knowledge?
by: Yi, Xinhao, et al.
Published: (2025)

LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data
by: Yang, Cehao, et al.
Published: (2025)

The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders?
by: Adedeji, Ayo, et al.
Published: (2025)

WER We Stand: Benchmarking Urdu ASR Models
by: Arif, Samee, et al.
Published: (2024)

Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
by: Soni, Nikita, et al.
Published: (2026)

We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation
by: Moon, Palash, et al.
Published: (2024)

Can LLMs Faithfully Explain Themselves in Low-Resource Languages? A Case Study on Emotion Detection in Persian
by: Mehrazar, Mobina, et al.
Published: (2025)

Can LLMs Write Faithfully? An Agent-Based Evaluation of LLM-generated Islamic Content
by: Mushtaq, Abdullah, et al.
Published: (2025)

Dissociation of Faithful and Unfaithful Reasoning in LLMs
by: Yee, Evelyn, et al.
Published: (2024)

Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
by: Bi, Baolong, et al.
Published: (2024)

Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing
by: Zhao, Raoyuan, et al.
Published: (2025)