| Main Authors: | Gladstone, Clovis; Fang, Zhao; Stewart, Spencer Dean |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.14688 |
Similar Items
A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950
by: Fang, Zhao, et al.
Published: (2025)
CulturALL: Benchmarking Multilingual and Multicultural Competence of LLMs on Grounded Tasks
by: Lin, Peiqin, et al.
Published: (2026)
Tokenization and Representation Biases in Multilingual Models on Dialectal NLP Tasks
by: Kanjirangat, Vani, et al.
Published: (2025)
Collective Reasoning Among LLMs: A Framework for Answer Validation Without Ground Truth
by: Davoudi, Seyed Pouyan Mousavi, et al.
Published: (2025)
Human-Centric NLP or AI-Centric Illusion?: A Critical Investigation
by: Spencer, Piyapath T
Published: (2024)
Testing the Limits of Truth Directions in LLMs
by: Poulis, Angelos, et al.
Published: (2026)
How Context Shapes Truth: Geometric Transformations of Statement-level Truth Representations in LLMs
by: Adarsh, Shivam, et al.
Published: (2026)
MultiNRC: A Challenging and Native Multilingual Reasoning Evaluation Benchmark for LLMs
by: Fabbri, Alexander R., et al.
Published: (2025)
AggTruth: Contextual Hallucination Detection using Aggregated Attention Scores in LLMs
by: Matys, Piotr, et al.
Published: (2025)
Comparative Performance of Advanced NLP Models and LLMs in Multilingual Geo-Entity Detection
by: Kopanov, Kalin
Published: (2024)
Truth is Universal: Robust Detection of Lies in LLMs
by: Bürger, Lennart, et al.
Published: (2024)
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
by: Wei, Zhepei, et al.
Published: (2025)
Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs
by: Rei, Ricardo, et al.
Published: (2025)
On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?
by: Choenni, Rochelle, et al.
Published: (2024)
GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
by: Gyamfi, Lawrence Adu, et al.
Published: (2026)
Advancing NLP Security by Leveraging LLMs as Adversarial Engines
by: Srinivasan, Sudarshan, et al.
Published: (2024)
Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs
by: Cattaneo, Alberto, et al.
Published: (2025)
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
by: Karia, Rushang, et al.
Published: (2024)
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
by: Calderon, Nitay, et al.
Published: (2024)
Graphing the Truth: Structured Visualizations for Automated Hallucination Detection in LLMs
by: Agrawal, Tanmay
Published: (2025)
Reasoning Isn't Enough: Examining Truth-Bias and Sycophancy in LLMs
by: Barkett, Emilio, et al.
Published: (2025)
Debating with More Persuasive LLMs Leads to More Truthful Answers
by: Khan, Akbir, et al.
Published: (2024)
Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?
by: Chatrath, Veronica, et al.
Published: (2024)
Ranking Large Language Models without Ground Truth
by: Dhurandhar, Amit, et al.
Published: (2024)
The Consensus Trap: Dissecting Subjectivity and the "Ground Truth" Illusion in Data Annotation
by: Munir, Sheza, et al.
Published: (2026)
Multilingual LLMs Are Not Multilingual Thinkers: Evidence from Hindi Analogy Evaluation
by: Gupta, Ashray, et al.
Published: (2025)
From Transformers to LLMs: A Systematic Survey of Efficiency Considerations in NLP
by: Ansar, Wazib, et al.
Published: (2024)
Beyond Catalogue Counts: the Dataset Visibility Asymmetry in Low-Resource Multilingual NLP
by: Tan, Zhiyin, et al.
Published: (2026)
Evaluating Deduplication Techniques for Economic Research Paper Titles with a Focus on Semantic Similarity using NLP and LLMs
by: You, Doohee, et al.
Published: (2024)
Single Ground Truth Is Not Enough: Adding Flexibility to Aspect-Based Sentiment Analysis Evaluation
by: Yang, Soyoung, et al.
Published: (2024)
Multilingual Prompt Engineering in Large Language Models: A Survey Across NLP Tasks
by: Vatsal, Shubham, et al.
Published: (2025)
The Riddle of Reflection: Evaluating Reasoning and Self-Awareness in Multilingual LLMs using Indian Riddles
by: M, Abhinav P, et al.
Published: (2025)
A Framework to Assess Multilingual Vulnerabilities of LLMs
by: Tang, Likai, et al.
Published: (2025)
TruthStance: An Annotated Dataset of Conversations on Truth Social
by: Ameen, Fathima, et al.
Published: (2026)
MASE: Interpretable NLP Models via Model-Agnostic Saliency Estimation
by: Yang, Zhou, et al.
Published: (2025)
TruthFlow: Truthful LLM Generation via Representation Flow Correction
by: Wang, Hanyu, et al.
Published: (2025)
Intertwining CP and NLP: The Generation of Unreasonably Constrained Sentences
by: Bonlarron, Alexandre, et al.
Published: (2024)
Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
by: Rystrøm, Jonathan, et al.
Published: (2025)
Machine-Assisted Grading of Nationwide School-Leaving Essay Exams with LLMs and Statistical NLP
by: Karjus, Andres, et al.
Published: (2026)
Multilingual jailbreaking of LLMs using low-resource languages
by: Marx, Dylan, et al.
Published: (2026)