| Main Authors: | Lu, Yi-Long, Zhang, Chunhui, Song, Jiajun, Fan, Lifeng, Wang, Wei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2504.01698 |
Similar Items
Mind the Gap: The Divergence Between Human and LLM-Generated Tasks
by: Lu, Yi-Long, et al.
Published: (2025)
Do AI Models Perform Human-like Abstract Reasoning Across Modalities?
by: Beger, Claas, et al.
Published: (2025)
NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding
by: Chan, Chunkit, et al.
Published: (2024)
Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner
by: Zhang, Chunhui, et al.
Published: (2025)
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
by: Amirizaniani, Maryam, et al.
Published: (2024)
ToMBench: Benchmarking Theory of Mind in Large Language Models
by: Chen, Zhuang, et al.
Published: (2024)
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
by: Xu, Hainiu, et al.
Published: (2024)
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
by: Kim, Hyunwoo, et al.
Published: (2025)
Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-The-Fly
by: Ying, Lance, et al.
Published: (2025)
Simulated Annealing Enhances Theory-of-Mind Reasoning in Autoregressive Language Models
by: Hu, Xucong, et al.
Published: (2026)
Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
by: Wagner, Eitan, et al.
Published: (2024)
Understanding Artificial Theory of Mind: Perturbed Tasks and Reasoning in Large Language Models
by: Nickel, Christian, et al.
Published: (2026)
Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in Large Language Models
by: Tang, Weizhi, et al.
Published: (2024)
PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking
by: Zhu, Wang Bill, et al.
Published: (2026)
InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles
by: Li, Zizhen, et al.
Published: (2025)
The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind
by: Lupu, Andrei, et al.
Published: (2025)
Plausibility as Commonsense Reasoning: Humans Succeed, Large Language Models Do not
by: Karakaş, Sercan
Published: (2026)
How Do Humans Write Code? Large Models Do It the Same Way Too
by: Li, Long, et al.
Published: (2024)
To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks
by: Gong, Nanxu, et al.
Published: (2026)
ToM-LM: Delegating Theory of Mind Reasoning to External Symbolic Executors in Large Language Models
by: Tang, Weizhi, et al.
Published: (2024)
Theory of Mind in Large Language Models: Assessment and Enhancement
by: Chen, Ruirui, et al.
Published: (2025)
Probing the Robustness of Theory of Mind in Large Language Models
by: Nickel, Christian, et al.
Published: (2024)
Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm
by: Babarczy, Anna, et al.
Published: (2026)
EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering and Reasoning
by: Wei, Mingyang, et al.
Published: (2026)
Decompose-ToM: Enhancing Theory of Mind Reasoning in Large Language Models through Simulation and Task Decomposition
by: Sarangi, Sneheel, et al.
Published: (2025)
HUMORCHAIN: Theory-Guided Multi-Stage Reasoning for Interpretable Multimodal Humor Generation
by: Zhang, Jiajun, et al.
Published: (2025)
Do Language Models Reason Across Languages?
by: Meng, Yan, et al.
Published: (2026)
Towards Safety Evaluations of Theory of Mind in Large Language Models
by: Aoshima, Tatsuhiro, et al.
Published: (2025)
Beyond Labels: Aligning Large Language Models with Human-like Reasoning
by: Kabir, Muhammad Rafsan, et al.
Published: (2024)
DiagnosisArena: Benchmarking Diagnostic Reasoning for Large Language Models
by: Zhu, Yakun, et al.
Published: (2025)
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language?
by: Jiang, Jin, et al.
Published: (2025)
Can Large Models Teach Student Models to Solve Mathematical Problems Like Human Beings? A Reasoning Distillation Method via Multi-LoRA Interaction
by: Li, Xinhe, et al.
Published: (2025)
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation
by: Bi, Zhenyu, et al.
Published: (2025)
What Shapes a Creative Machine Mind? Comprehensively Benchmarking Creativity in Foundation Models
by: He, Zicong, et al.
Published: (2025)
CodeMind: Evaluating Large Language Models for Code Reasoning
by: Liu, Changshu, et al.
Published: (2024)
LongReasonArena: A Long Reasoning Benchmark for Large Language Models
by: Ding, Jiayu, et al.
Published: (2025)
Theory of Mind for Multi-Agent Collaboration via Large Language Models
by: Li, Huao, et al.
Published: (2023)
Small Language Models for Emergency Departments Decision Support: A Benchmark Study
by: Wang, Zirui, et al.
Published: (2025)
An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning
by: Sun, Wei, et al.
Published: (2025)
Implicit Probabilistic Reasoning Does Not Reflect Explicit Answers in Large Language Models
by: Mondal, Manuel, et al.
Published: (2024)