| Main Authors: | Muchovej, John, Royka, Amanda, Lee, Shane, Jara-Ettinger, Julian |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.12150 |
Similar Items
Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking
by: Bayani, David
Published: (2023)
Automated Meta Prompt Engineering for Alignment with the Theory of Mind
by: Baughman, Aaron, et al.
Published: (2025)
Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind
by: Xiao, Hanqi, et al.
Published: (2026)
A Notion of Complexity for Theory of Mind via Discrete World Models
by: Huang, X. Angelo, et al.
Published: (2024)
Single layer tiny Co$^4$ outpaces GPT-2 and GPT-BERT
by: Zain, Noor Ul, et al.
Published: (2025)
Explore Theory of Mind: Program-guided adversarial data generation for theory of mind reasoning
by: Sclar, Melanie, et al.
Published: (2024)
Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning
by: Sarangi, Sneheel, et al.
Published: (2025)
Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind
by: Ackerman, Christopher
Published: (2026)
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
by: Chen, Junying, et al.
Published: (2024)
Architectural Flaw Detection in Civil Engineering Using GPT-4
by: Kumar, Saket, et al.
Published: (2024)
ChatQA: Surpassing GPT-4 on Conversational QA and RAG
by: Liu, Zihan, et al.
Published: (2024)
ArabianGPT: Native Arabic GPT-based Large Language Model
by: Koubaa, Anis, et al.
Published: (2024)
Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
by: Bergsma, Shane, et al.
Published: (2025)
MMToM-QA: Multimodal Theory of Mind Question Answering
by: Jin, Chuanyang, et al.
Published: (2024)
Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension
by: Vatsal, Shubham, et al.
Published: (2024)
Can We Count on LLMs? The Fixed-Effect Fallacy and Claims of GPT-4 Capabilities
by: Ball, Thomas, et al.
Published: (2024)
MediaMind: Revolutionizing Media Monitoring using Agentification
by: Gunduz, Ahmet, et al.
Published: (2025)
Mini Minds: Exploring Bebeshka and Zlata Baby Models
by: Proskurina, Irina, et al.
Published: (2023)
Benchmarking ChatGPT on Algorithmic Reasoning
by: McLeish, Sean, et al.
Published: (2024)
Agent-ToM: Learning to Monitor Autonomous LLM Agents via Theory-of-Mind Reasoning
by: Ahmed, Nesreen K., et al.
Published: (2026)
Low-Resource Languages Jailbreak GPT-4
by: Yong, Zheng-Xin, et al.
Published: (2023)
Using Hallucinations to Bypass GPT4's Filter
by: Lemkin, Benjamin
Published: (2024)
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
by: Zhao, Justin, et al.
Published: (2024)
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
by: Shi, Haojun, et al.
Published: (2024)
Geometric-Averaged Preference Optimization for Soft Preference Labels
by: Furuta, Hiroki, et al.
Published: (2024)
Can we trust the evaluation on ChatGPT?
by: Aiyappa, Rachith, et al.
Published: (2023)
NExT-GPT: Any-to-Any Multimodal LLM
by: Wu, Shengqiong, et al.
Published: (2023)
Universal Neurons in GPT2 Language Models
by: Gurnee, Wes, et al.
Published: (2024)
HumanEval on Latest GPT Models -- 2024
by: Li, Daniel, et al.
Published: (2024)
Fairness of ChatGPT
by: Li, Yunqi, et al.
Published: (2023)
Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data
by: Leng, Yu, et al.
Published: (2025)
Mind the Gap: A Review of Arabic Post-Training Datasets and Their Limitations
by: Alkhowaiter, Mohammed, et al.
Published: (2025)
Mechanistic Interpretability of GPT-like Models on Summarization Tasks
by: Mishra, Anurag
Published: (2025)
If in a Crowdsourced Data Annotation Pipeline, a GPT-4
by: He, Zeyu, et al.
Published: (2024)
Meta-GPT: Decoding the Metasurface Genome with Generative Artificial Intelligence
by: Dang, David, et al.
Published: (2025)
How Prevalent is Gender Bias in ChatGPT? -- Exploring German and English ChatGPT Responses
by: Urchs, Stefanie, et al.
Published: (2023)
MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters
by: Dada, Amin, et al.
Published: (2025)
Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
by: Zhang, Jiazheng, et al.
Published: (2025)
PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners
by: Xiao, Yijia, et al.
Published: (2023)
Enhancing Antibiotic Stewardship using a Natural Language Approach for Better Feature Representation
by: Lee, Simon A., et al.
Published: (2024)