:: Library Catalog

Bibliographic Details
Main Authors:	Nananukul, Navapat; Zhang, Yue; Lee, Ryan; Boxer, Eric; May, Jonathan; Gogate, Vibhav Giridhar; Pujara, Jay; Kejriwal, Mayank
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.01530

Similar Items

HALO: An Ontology for Representing and Categorizing Hallucinations in Large Language Models
by: Nananukul, Navapat, et al.
Published: (2023)

ClinicBot: A Guideline-Grounded Clinical Chatbot with Prioritized Evidence RAG and Verifiable Citations
by: Nananukul, Navapat, et al.
Published: (2026)

An Analysis of Artificial Intelligence Adoption in NIH-Funded Research
by: Nananukul, Navapat, et al.
Published: (2026)

Cost-Efficient Prompt Engineering for Unsupervised Entity Resolution
by: Nananukul, Navapat, et al.
Published: (2023)

What if Red Can Talk? Dynamic Dialogue Generation Using Large Language Models
by: Nananukul, Navapat, et al.
Published: (2024)

Toward Better Temporal Structures for Geopolitical Events Forecasting
by: Ahrabian, Kian, et al.
Published: (2026)

Learning to Guide Local Search for MPE Inference in Probabilistic Graphical Models
by: Malhotra, Brij, et al.
Published: (2026)

Learning to Condition: A Neural Heuristic for Scalable MPE Inference
by: Malhotra, Brij, et al.
Published: (2025)

Unsupervised Federated Domain Adaptation for Segmentation of MRI Images
by: Nananukul, Navapat, et al.
Published: (2024)

Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations
by: Aravindan, Ashwath Vaithinathan, et al.
Published: (2026)

SelECT-SQL: Self-correcting ensemble Chain-of-Thought for Text-to-SQL
by: Shen, Ke, et al.
Published: (2024)

Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization
by: Zhang, Yue, et al.
Published: (2024)

GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning
by: Tang, Zhisheng, et al.
Published: (2024)

Which Questions Improve Learning the Most? Utility Estimation of Questions with LM-based Simulations
by: Lee, Dong-Ho, et al.
Published: (2025)

Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment
by: Zhang, Yue, et al.
Published: (2025)

Neural Network Approximators for Marginal MAP in Probabilistic Circuits
by: Arya, Shivvrat, et al.
Published: (2024)

Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification
by: Arya, Shivvrat, et al.
Published: (2024)

Learning to Solve the Constrained Most Probable Explanation Task in Probabilistic Graphical Models
by: Arya, Shivvrat, et al.
Published: (2024)

Navigating Semantic Relations: Challenges for Language Models in Abstract Common-Sense Reasoning
by: Gawin, Cole, et al.
Published: (2025)

Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference
by: Shen, Ke, et al.
Published: (2024)

Generating Novelty in Open-World Multi-Agent Strategic Board Games
by: Kejriwal, Mayank, et al.
Published: (2025)

A Compound AI Agent for Conversational Grant Discovery
by: Tang, Zhisheng, et al.
Published: (2026)

Modeling and Simulating Agent-Based City Migration Using Conway's Game of Life
by: Deng, Bruce, et al.
Published: (2024)

Theory Discovery in Social Networks: Automating ERGM Specification with Large Language Models
by: Sun, Yidan, et al.
Published: (2026)

Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem
by: Maganur, Shama, et al.
Published: (2026)

Modeling Inequality in Complex Networks of Strategic Agents using Iterative Game-Theoretic Transactions
by: Kejriwal, Mayank, et al.
Published: (2025)

Humanlike Cognitive Patterns as Emergent Phenomena in Large Language Models
by: Tang, Zhisheng, et al.
Published: (2024)

Second Guess: Detecting Uncertainty Through Abstention and Answer Stability in Small Language Models
by: Aravindan, Ashwath Vaithinathan, et al.
Published: (2026)

Towards Scene Graph Anticipation
by: Peddi, Rohith, et al.
Published: (2024)

An Evaluation of Estimative Uncertainty in Large Language Models
by: Tang, Zhisheng, et al.
Published: (2024)

Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions
by: Ji, Yongyi, et al.
Published: (2024)

Characterizing Robustness of Strategies to Novelty in Zero-Sum Open Worlds
by: Kejriwal, Mayank, et al.
Published: (2026)

Towards Unbiased and Robust Spatio-Temporal Scene Graph Generation and Anticipation
by: Peddi, Rohith, et al.
Published: (2024)

Code-Driven Planning in Grid Worlds with Large Language Models
by: Aravindan, Ashwath Vaithinathan, et al.
Published: (2025)

Grasping Trajectory Optimization with Point Clouds
by: Xiang, Yu, et al.
Published: (2024)

A Skill-augmented Agentic Framework and Benchmark for Multi-Video Understanding
by: Zhang, Yue, et al.
Published: (2026)

Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification
by: Patil, Vishal, et al.
Published: (2026)

From Query to Logic: Ontology-Driven Multi-Hop Reasoning in LLMs
by: Bian, Haonan, et al.
Published: (2025)

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
by: Singh, Joykirat, et al.
Published: (2024)

Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos
by: Peddi, Rohith, et al.
Published: (2026)