| Main Authors: | Robertson, Alex, Liang, Huizhi, Gani, Mahbub, Kumar, Rohit, Rajamohan, Srijith |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.19643 |
Similar Items
Training for Compositional Sensitivity Reduces Dense Retrieval Generalization
by: Ralev, Radoslav, et al.
Published: (2026)
BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment
by: Wang, Sizhe, et al.
Published: (2024)
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework
by: Sansford, Hannah, et al.
Published: (2024)
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs
by: Markowitz, Elan, et al.
Published: (2025)
D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs
by: Ding, Yue, et al.
Published: (2025)
MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations
by: Lavrinovics, Ernests, et al.
Published: (2025)
MHGraphBench: Knowledge Graph-Grounded Benchmarking of Mental Health Knowledge in Large Language Models
by: Liu, Weixin, et al.
Published: (2026)
FinReflectKG -- HalluBench: GraphRAG Hallucination Benchmark for Financial Question Answering Systems
by: Kumar, Mahesh, et al.
Published: (2026)
SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
by: Liu, Zhiqiang, et al.
Published: (2025)
From Hallucinations to Facts: Enhancing Language Models with Curated Knowledge Graphs
by: Joshi, Ratnesh Kumar, et al.
Published: (2024)
Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
by: Gill, Waris, et al.
Published: (2025)
Do These LLM Benchmarks Agree? Fixing Benchmark Evaluation with BenchBench
by: Perlitz, Yotam, et al.
Published: (2024)
Enhancing Knowledge Graph Construction: Evaluating with Emphasis on Hallucination, Omission, and Graph Similarity Metrics
by: Ghanem, Hussam, et al.
Published: (2025)
Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation
by: Xie, Pengzhen, et al.
Published: (2025)
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
by: Nguyen, Hieu, et al.
Published: (2025)
CE-Bench: Towards a Reliable Contrastive Evaluation Benchmark of Interpretability of Sparse Autoencoders
by: Gulko, Alex, et al.
Published: (2025)
Ensemble based approach to quantifying uncertainty of LLM based classifications
by: Rajamohan, Srijith, et al.
Published: (2025)
OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases
by: Chen, Yongrui, et al.
Published: (2025)
On Characterizations for Language Generation: Interplay of Hallucinations, Breadth, and Stability
by: Kalavasis, Alkis, et al.
Published: (2024)
Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering
by: Pusch, Larissa, et al.
Published: (2024)
LastingBench: Defend Benchmarks Against Knowledge Leakage
by: Fang, Yixiong, et al.
Published: (2025)
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs
by: Bharadwaj, Rohit, et al.
Published: (2024)
Agent-as-a-Graph: Knowledge Graph-Based Tool and Agent Retrieval for LLM Multi-Agent Systems
by: Nizar, Faheem, et al.
Published: (2025)
The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination
by: Zhang, Yuji, et al.
Published: (2025)
nicolay-r at SemEval-2024 Task 3: Using Flan-T5 for Reasoning Emotion Cause in Conversations with Chain-of-Thought on Emotion States
by: Rusnachenko, Nicolay, et al.
Published: (2024)
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
by: Hron, Jiri, et al.
Published: (2024)
LLM-Guided Knowledge Distillation for Temporal Knowledge Graph Reasoning
by: Xing, Wang, et al.
Published: (2026)
Characterizing Knowledge Graph Tasks in LLM Benchmarks Using Cognitive Complexity Frameworks
by: Todorovikj, Sara, et al.
Published: (2025)
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
by: Feng, Shangbin, et al.
Published: (2024)
EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge
by: Zaporojets, Klim, et al.
Published: (2025)
MMKU-Bench: A Multimodal Update Benchmark for Diverse Visual Knowledge
by: Fu, Baochen, et al.
Published: (2026)
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey
by: Agrawal, Garima, et al.
Published: (2023)
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
by: Du, Yuntao, et al.
Published: (2025)
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions
by: Zhu, Yanxu, et al.
Published: (2024)
NCL-UoR at SemEval-2026 Task 5: Embedding-Based Methods, Fine-Tuning, and LLMs for Word Sense Plausibility Rating
by: Wu, Tong, et al.
Published: (2026)
SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization
by: Yuan, Jiarui, et al.
Published: (2026)
Grounding LLM Reasoning with Knowledge Graphs
by: Amayuelas, Alfonso, et al.
Published: (2025)
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models
by: Son, Guijin, et al.
Published: (2023)
Knowledge Verification to Nip Hallucination in the Bud
by: Wan, Fanqi, et al.
Published: (2024)
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM
by: Su, Zhaochen, et al.
Published: (2024)