Library Catalog

Bibliographic Details
Main Authors:	Afane, Mohamed; Robitschek, Emily; Ouyang, Derek; Ho, Daniel E.
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.19895

Similar Items

Benchmarking Legal RAG: The Promise and Limits of AI Statutory Surveys
by: Afane, Mohamed, et al.
Published: (2026)

Can LLMs Help Allocate Public Health Resources? A Case Study on Childhood Lead Testing
by: Afane, Mohamed, et al.
Published: (2025)

A Progressive Visual-Logic-Aligned Framework for Ride-Hailing Adjudication
by: Wu, Weiming, et al.
Published: (2026)

ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks
by: Afane, Mohamed, et al.
Published: (2025)

Deciding When Not to Decide: Indeterminacy-Aware Intrusion Detection with NeutroSENSE
by: Al-Masri, Eyhab
Published: (2025)

Critical Windows of Complexity Control: When Transformers Decide to Reason or Memorize
by: Ali, Sarwan
Published: (2026)

Next-Generation Phishing: How LLM Agents Empower Cyber Attackers
by: Afane, Khalifa, et al.
Published: (2024)

Learning to Decide with AI Assistance under Human-Alignment
by: Benz, Nina Corvelo, et al.
Published: (2026)

Automating Adjudication of Cardiovascular Events Using Large Language Models
by: Sivarajkumar, Sonish, et al.
Published: (2025)

Adjudicator: Correcting Noisy Labels with a KG-Informed Council of LLM Agents
by: You, Doohee, et al.
Published: (2025)

TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication
by: Zhang, Haolin, et al.
Published: (2026)

From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks
by: Stephan, Andreas, et al.
Published: (2024)

ConsistencyAI: A Benchmark to Assess LLMs' Factual Consistency When Responding to Different Demographic Groups
by: Banyas, Peter, et al.
Published: (2025)

AIRA_2: Overcoming Bottlenecks in AI Research Agents
by: Hambardzumyan, Karen, et al.
Published: (2026)

Do Benchmarks Underestimate LLM Performance? Evaluating Hallucination Detection With LLM-First Human-Adjudicated Assessment
by: Atasoy, I. F., et al.
Published: (2026)

The Prompt War: How AI Decides on a Military Intervention
by: Chupilkin, Maxim
Published: (2025)

When Language Shapes Thought: Cross-Lingual Transfer of Factual Knowledge in Question Answering
by: Kang, Eojin, et al.
Published: (2025)

Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
by: Ramprasad, Sanjana, et al.
Published: (2024)

Learning to Decide with Just Enough: Information-Theoretic Context Summarization for CMDPs
by: Liu, Peidong, et al.
Published: (2025)

Do Proactive Agents Really Need an LLM to Decide When to Wake and What to Anchor?
by: Liu, Xiaoze, et al.
Published: (2026)

SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization
by: Mishra, Prakamya, et al.
Published: (2024)

Artificial Intelligence and Child Custody Adjudication: A Comparative Study of Estonia and Nigeria
by: Folajuwon-Banjo, Emilia Oluwaseun
Published: (2025)

Reason2Decide: Rationale-Driven Multi-Task Learning
by: Hasan, H M Quamran, et al.
Published: (2025)

On the Size Complexity and Decidability of First-Order Progression
by: Classen, Jens, et al.
Published: (2026)

Deciding the Satisfiability of Combined Qualitative Constraint Networks
by: Cohen-Solal, Quentin, et al.
Published: (2026)

Decidable By Construction: Design-Time Verification for Trustworthy AI
by: Haynes, Houston
Published: (2026)

Factuality on Demand: Controlling the Factuality-Informativeness Trade-off in Text Generation
by: Gong, Ziwei, et al.
Published: (2026)

Deciding how to respond: A deliberative framework to guide policymaker responses to AI systems
by: Fourie, Willem
Published: (2025)

Do I Really Know? Learning Factual Self-Verification for Hallucination Reduction
by: Altinisik, Enes, et al.
Published: (2026)

Locomo-Plus: Beyond-Factual Cognitive Memory Evaluation Framework for LLM Agents
by: Li, Yifei, et al.
Published: (2026)

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs
by: Iqbal, Hasan, et al.
Published: (2024)

MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
by: Ning, Yucheng, et al.
Published: (2025)

GeoDecider: A Coarse-to-Fine Agentic Workflow for Explainable Lithology Classification
by: Wang, Jiahao, et al.
Published: (2026)

Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner
by: Zhang, Chunhui, et al.
Published: (2025)

Knowledge Authoring with Factual English, Rules, and Actions
by: Wang, Yuheng
Published: (2024)

Factuality of Large Language Models: A Survey
by: Wang, Yuxia, et al.
Published: (2024)

Evaluating Reliability Asymmetries in Chinese Factual Search and AI Answers
by: Liu, Geng, et al.
Published: (2025)

Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning
by: Wang, Jianing, et al.
Published: (2023)

Tracking vs. Deciding: The Dual-Capability Bottleneck in Searchless Chess Transformers
by: Li, Quanhao, et al.
Published: (2026)

Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
by: Bi, Baolong, et al.
Published: (2024)