| Main Authors: | Şenol, Ali, Agrawal, Garima, Liu, Huan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.24661 |
Similar Items
Domain Knowledge-Enhanced LLMs for Fraud and Concept Drift Detection
by: Şenol, Ali, et al.
Published: (2025)
Joint Detection of Fraud and Concept Drift in Online Conversations with LLM-Assisted Judgment
by: Senol, Ali, et al.
Published: (2025)
RedditESS: A Mental Health Social Support Interaction Dataset -- Understanding Effective Social Support to Refine AI-Driven Support Tools
by: Alghamdi, Zeyad, et al.
Published: (2025)
Breaking Thought Patterns: A Multi-Dimensional Reasoning Framework for LLMs
by: Tang, Xintong, et al.
Published: (2025)
A Survey of AI-generated Text Forensic Systems: Detection, Attribution, and Characterization
by: Kumarage, Tharindu, et al.
Published: (2024)
Beyond-RAG: Question Identification and Answer Generation in Real-Time Conversations
by: Agrawal, Garima, et al.
Published: (2024)
Graphing the Truth: Structured Visualizations for Automated Hallucination Detection in LLMs
by: Agrawal, Tanmay
Published: (2025)
TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes
by: Shah, Raj Sanjay, et al.
Published: (2025)
MTQ-Eval: Multilingual Text Quality Evaluation for Language Models
by: Pokharel, Rhitabrat, et al.
Published: (2025)
MTCMB: A Multi-Task Benchmark Framework for Evaluating LLMs on Knowledge, Reasoning, and Safety in Traditional Chinese Medicine
by: Kong, Shufeng, et al.
Published: (2025)
Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
by: Chen, Ping, et al.
Published: (2025)
Measuring AI Reasoning: A Guide for Researchers
by: Nwadike, Munachiso Samuel, et al.
Published: (2026)
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey
by: Agrawal, Garima, et al.
Published: (2023)
Can LLMs perform structured graph reasoning?
by: Agrawal, Palaash, et al.
Published: (2024)
TriEx: A Game-based Tri-View Framework for Explaining Internal Reasoning in Multi-Agent LLMs
by: Wang, Ziyi, et al.
Published: (2026)
MultiNRC: A Challenging and Native Multilingual Reasoning Evaluation Benchmark for LLMs
by: Fabbri, Alexander R., et al.
Published: (2025)
MedRedFlag: Investigating how LLMs Redirect Misconceptions in Real-World Health Communication
by: Sambara, Sraavya, et al.
Published: (2026)
POLIS-Bench: Towards Multi-Dimensional Evaluation of LLMs for Bilingual Policy Tasks in Governmental Scenarios
by: Yang, Tingyue, et al.
Published: (2025)
Fair Summarization: Bridging Quality and Diversity in Extractive Summaries
by: Nezhad, Sina Bagheri, et al.
Published: (2024)
Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering
by: Keluskar, Aryan, et al.
Published: (2024)
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
by: Zhao, Chengshuai, et al.
Published: (2025)
Are Your LLMs Capable of Stable Reasoning?
by: Liu, Junnan, et al.
Published: (2024)
How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study
by: Sun, Moran, et al.
Published: (2026)
What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation
by: Do, Heejin, et al.
Published: (2025)
Rethinking Cross-lingual Alignment: Balancing Transfer and Cultural Erasure in Multilingual LLMs
by: Han, HyoJung, et al.
Published: (2025)
Understanding Position Bias Effects on Fairness in Social Multi-Document Summarization
by: Olabisi, Olubusayo, et al.
Published: (2024)
Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
by: Li, Hanqing, et al.
Published: (2025)
Weaker LLMs' Opinions Also Matter: Mixture of Opinions Enhances LLM's Mathematical Reasoning
by: Chen, Yanan, et al.
Published: (2025)
Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs
by: Liu, Jinbo, et al.
Published: (2025)
A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models
by: Jones, Jaylen, et al.
Published: (2024)
Hypothesis-Driven Feature Manifold Analysis in LLMs via Supervised Multi-Dimensional Scaling
by: Tiblias, Federico, et al.
Published: (2025)
Empowering LLMs with Logical Reasoning: A Comprehensive Survey
by: Cheng, Fengxiang, et al.
Published: (2025)
Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation
by: Lin, Minhua, et al.
Published: (2024)
From Query to Logic: Ontology-Driven Multi-Hop Reasoning in LLMs
by: Bian, Haonan, et al.
Published: (2025)
Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs
by: Zi, Xing, et al.
Published: (2026)
TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios
by: Wei, Shaohang, et al.
Published: (2025)
Beyond Next Word Prediction: Developing Comprehensive Evaluation Frameworks for measuring LLM performance on real world applications
by: Agrawal, Vishakha, et al.
Published: (2025)
Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs
by: Kietkajornrit, Auksarapak, et al.
Published: (2026)
LLMs for Relational Reasoning: How Far are We?
by: Li, Zhiming, et al.
Published: (2024)
MMORF: A Multi-agent Framework for Designing Multi-objective Retrosynthesis Planning Systems
by: Baker, Frazier N., et al.
Published: (2026)