Library Catalog

Bibliographic Details
Main Authors:	Banatt, Eryk, Cheng, Jonathan, Vaidyanath, Skanda, Hwu, Tiffany
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.10998

Similar Items

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?
by: Yang, Lin, et al.
Published: (2026)

Demonstration-Free Robotic Control via LLM Agents
by: Tsui, Brian Y., et al.
Published: (2026)

Inductive-Deductive Strategy Reuse for Multi-Turn Instructional Dialogues
by: Ou, Jiao, et al.
Published: (2024)

Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks
by: Tian, Aaron Xuxiang, et al.
Published: (2025)

MatheMagic: Generating Dynamic Mathematics Benchmarks Robust to Memorization
by: O'Brien, Dayyán, et al.
Published: (2025)

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs
by: Sirdeshmukh, Ved, et al.
Published: (2025)

Echoes of Human Malice in Agents: Benchmarking LLMs for Multi-Turn Online Harassment Attacks
by: Padhi, Trilok, et al.
Published: (2025)

Self-Supervised Inductive Logic Programming
by: Patsantzis, Stassa
Published: (2025)

SCICONVBENCH: Benchmarking LLMs on Multi-Turn Clarification for Task Formulation in Computational Science
by: Somasekharan, Nithin, et al.
Published: (2026)

Inductive Learning of Logical Theories with LLMs: An Expressivity-Graded Analysis
by: Gandarela, João Pedro, et al.
Published: (2024)

Satisfiability Modulo Theory Meets Inductive Logic Programming
by: Upreti, Nijesh, et al.
Published: (2025)

Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
by: Ruzzetti, Elena Sofia, et al.
Published: (2025)

Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs
by: Cheng, Kewei, et al.
Published: (2024)

Logical Reasoning with Relation Network for Inductive Knowledge Graph Completion
by: Zhang, Qinggang, et al.
Published: (2024)

Memorization and Knowledge Injection in Gated LLMs
by: Pan, Xu, et al.
Published: (2025)

MultiZebraLogic: A Multilingual Logical Reasoning Benchmark
by: Bruun, Sofie Helene, et al.
Published: (2025)

Differentiable Inductive Logic Programming for Fraud Detection
by: Wolfson, Boris, et al.
Published: (2024)

From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems
by: Rahman, A M Muntasir, et al.
Published: (2024)

Beyond Itinerary Planning-A Real-World Benchmark for Multi-Turn and Tool-Using Travel Tasks
by: Cheng, Xiang, et al.
Published: (2025)

Differentiable Inductive Logic Programming in High-Dimensional Space
by: Purgał, Stanisław J., et al.
Published: (2022)

HalluHard: A Hard Multi-Turn Hallucination Benchmark
by: Fan, Dongyang, et al.
Published: (2026)

Temporal Inductive Logic Reasoning over Hypergraphs
by: Yang, Yuan, et al.
Published: (2022)

Towards Probabilistic Inductive Logic Programming with Neurosymbolic Inference and Relaxation
by: Hillerstrom, Fieke, et al.
Published: (2024)

Benchmarking Correctness and Security in Multi-Turn Code Generation
by: Rawal, Ruchit, et al.
Published: (2025)

MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games
by: Xie, Yunfei, et al.
Published: (2026)

MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues
by: Pan, Yaning, et al.
Published: (2025)

Inductive Learning for Possibilistic Logic Programs Under Stable Models
by: Hu, Hongbo, et al.
Published: (2025)

Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach
by: Meli, Daniele, et al.
Published: (2024)

Modelling brain connectomes networks: Solv is a worthy competitor to hyperbolic geometry!
by: Celińska-Kopczyńska, Dorota, et al.
Published: (2024)

LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs
by: Cai, Yanan, et al.
Published: (2025)

Shallow Robustness, Deep Vulnerabilities: Multi-Turn Evaluation of Medical LLMs
by: Manczak, Blazej, et al.
Published: (2025)

Are Large Language Models Memorizing Bug Benchmarks?
by: Ramos, Daniel, et al.
Published: (2024)

MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions
by: Liang, Zhenwen, et al.
Published: (2024)

Combining LLMs with Logic-Based Framework to Explain MCTS
by: An, Ziyan, et al.
Published: (2025)

A Relational Inductive Bias for Dimensional Abstraction in Neural Networks
by: Campbell, Declan, et al.
Published: (2024)

Do LLMs Really Memorize Personally Identifiable Information? Revisiting PII Leakage with a Cue-Controlled Memorization Framework
by: Luo, Xiaoyu, et al.
Published: (2026)

LogicGraph : Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification
by: Wu, Yanrui, et al.
Published: (2026)

Towards Robust Legal Reasoning: Harnessing Logical LLMs in Law
by: Kant, Manuj, et al.
Published: (2025)

DialBGM: A Benchmark for Background Music Recommendation from Everyday Multi-Turn Dialogues
by: Shin, Joonhyeok, et al.
Published: (2026)

HICode: Hierarchical Inductive Coding with LLMs
by: Zhong, Mian, et al.
Published: (2025)