| Main Authors: | Banatt, Eryk; Cheng, Jonathan; Vaidyanath, Skanda; Hwu, Tiffany |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.10998 |
Similar Items
MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?
by: Yang, Lin, et al.
Published: (2026)
Demonstration-Free Robotic Control via LLM Agents
by: Tsui, Brian Y., et al.
Published: (2026)
Inductive-Deductive Strategy Reuse for Multi-Turn Instructional Dialogues
by: Ou, Jiao, et al.
Published: (2024)
Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks
by: Tian, Aaron Xuxiang, et al.
Published: (2025)
MatheMagic: Generating Dynamic Mathematics Benchmarks Robust to Memorization
by: O'Brien, Dayyán, et al.
Published: (2025)
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs
by: Sirdeshmukh, Ved, et al.
Published: (2025)
Echoes of Human Malice in Agents: Benchmarking LLMs for Multi-Turn Online Harassment Attacks
by: Padhi, Trilok, et al.
Published: (2025)
Self-Supervised Inductive Logic Programming
by: Patsantzis, Stassa
Published: (2025)
SCICONVBENCH: Benchmarking LLMs on Multi-Turn Clarification for Task Formulation in Computational Science
by: Somasekharan, Nithin, et al.
Published: (2026)
Inductive Learning of Logical Theories with LLMs: An Expressivity-Graded Analysis
by: Gandarela, João Pedro, et al.
Published: (2024)
Satisfiability Modulo Theory Meets Inductive Logic Programming
by: Upreti, Nijesh, et al.
Published: (2025)
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
by: Ruzzetti, Elena Sofia, et al.
Published: (2025)
Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs
by: Cheng, Kewei, et al.
Published: (2024)
Logical Reasoning with Relation Network for Inductive Knowledge Graph Completion
by: Zhang, Qinggang, et al.
Published: (2024)
Memorization and Knowledge Injection in Gated LLMs
by: Pan, Xu, et al.
Published: (2025)
MultiZebraLogic: A Multilingual Logical Reasoning Benchmark
by: Bruun, Sofie Helene, et al.
Published: (2025)
Differentiable Inductive Logic Programming for Fraud Detection
by: Wolfson, Boris, et al.
Published: (2024)
From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems
by: Rahman, A M Muntasir, et al.
Published: (2024)
Beyond Itinerary Planning-A Real-World Benchmark for Multi-Turn and Tool-Using Travel Tasks
by: Cheng, Xiang, et al.
Published: (2025)
Differentiable Inductive Logic Programming in High-Dimensional Space
by: Purgał, Stanisław J., et al.
Published: (2022)
HalluHard: A Hard Multi-Turn Hallucination Benchmark
by: Fan, Dongyang, et al.
Published: (2026)
Temporal Inductive Logic Reasoning over Hypergraphs
by: Yang, Yuan, et al.
Published: (2022)
Towards Probabilistic Inductive Logic Programming with Neurosymbolic Inference and Relaxation
by: Hillerstrom, Fieke, et al.
Published: (2024)
Benchmarking Correctness and Security in Multi-Turn Code Generation
by: Rawal, Ruchit, et al.
Published: (2025)
MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games
by: Xie, Yunfei, et al.
Published: (2026)
MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues
by: Pan, Yaning, et al.
Published: (2025)
Inductive Learning for Possibilistic Logic Programs Under Stable Models
by: Hu, Hongbo, et al.
Published: (2025)
Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach
by: Meli, Daniele, et al.
Published: (2024)
Modelling brain connectomes networks: Solv is a worthy competitor to hyperbolic geometry!
by: Celińska-Kopczyńska, Dorota, et al.
Published: (2024)
LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs
by: Cai, Yanan, et al.
Published: (2025)
Shallow Robustness, Deep Vulnerabilities: Multi-Turn Evaluation of Medical LLMs
by: Manczak, Blazej, et al.
Published: (2025)
Are Large Language Models Memorizing Bug Benchmarks?
by: Ramos, Daniel, et al.
Published: (2024)
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions
by: Liang, Zhenwen, et al.
Published: (2024)
Combining LLMs with Logic-Based Framework to Explain MCTS
by: An, Ziyan, et al.
Published: (2025)
A Relational Inductive Bias for Dimensional Abstraction in Neural Networks
by: Campbell, Declan, et al.
Published: (2024)
Do LLMs Really Memorize Personally Identifiable Information? Revisiting PII Leakage with a Cue-Controlled Memorization Framework
by: Luo, Xiaoyu, et al.
Published: (2026)
LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification
by: Wu, Yanrui, et al.
Published: (2026)
Towards Robust Legal Reasoning: Harnessing Logical LLMs in Law
by: Kant, Manuj, et al.
Published: (2025)
DialBGM: A Benchmark for Background Music Recommendation from Everyday Multi-Turn Dialogues
by: Shin, Joonhyeok, et al.
Published: (2026)
HICode: Hierarchical Inductive Coding with LLMs
by: Zhong, Mian, et al.
Published: (2025)