:: Library Catalog

Bibliographic Details
Main Authors:	Muchovej, John; Royka, Amanda; Lee, Shane; Jara-Ettinger, Julian
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2602.12150

Similar Items

Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking
by: Bayani, David
Published: (2023)

Automated Meta Prompt Engineering for Alignment with the Theory of Mind
by: Baughman, Aaron, et al.
Published: (2025)

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind
by: Xiao, Hanqi, et al.
Published: (2026)

A Notion of Complexity for Theory of Mind via Discrete World Models
by: Huang, X. Angelo, et al.
Published: (2024)

Single layer tiny Co$^4$ outpaces GPT-2 and GPT-BERT
by: Zain, Noor Ul, et al.
Published: (2025)

Explore Theory of Mind: Program-guided adversarial data generation for theory of mind reasoning
by: Sclar, Melanie, et al.
Published: (2024)

Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning
by: Sarangi, Sneheel, et al.
Published: (2025)

Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind
by: Ackerman, Christopher
Published: (2026)

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
by: Chen, Junying, et al.
Published: (2024)

Architectural Flaw Detection in Civil Engineering Using GPT-4
by: Kumar, Saket, et al.
Published: (2024)

ChatQA: Surpassing GPT-4 on Conversational QA and RAG
by: Liu, Zihan, et al.
Published: (2024)

ArabianGPT: Native Arabic GPT-based Large Language Model
by: Koubaa, Anis, et al.
Published: (2024)

Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
by: Bergsma, Shane, et al.
Published: (2025)

MMToM-QA: Multimodal Theory of Mind Question Answering
by: Jin, Chuanyang, et al.
Published: (2024)

Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension
by: Vatsal, Shubham, et al.
Published: (2024)

Can We Count on LLMs? The Fixed-Effect Fallacy and Claims of GPT-4 Capabilities
by: Ball, Thomas, et al.
Published: (2024)

MediaMind: Revolutionizing Media Monitoring using Agentification
by: Gunduz, Ahmet, et al.
Published: (2025)

Mini Minds: Exploring Bebeshka and Zlata Baby Models
by: Proskurina, Irina, et al.
Published: (2023)

Benchmarking ChatGPT on Algorithmic Reasoning
by: McLeish, Sean, et al.
Published: (2024)

Agent-ToM: Learning to Monitor Autonomous LLM Agents via Theory-of-Mind Reasoning
by: Ahmed, Nesreen K., et al.
Published: (2026)

Low-Resource Languages Jailbreak GPT-4
by: Yong, Zheng-Xin, et al.
Published: (2023)

Using Hallucinations to Bypass GPT4's Filter
by: Lemkin, Benjamin
Published: (2024)

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
by: Zhao, Justin, et al.
Published: (2024)

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
by: Shi, Haojun, et al.
Published: (2024)

Geometric-Averaged Preference Optimization for Soft Preference Labels
by: Furuta, Hiroki, et al.
Published: (2024)

Can we trust the evaluation on ChatGPT?
by: Aiyappa, Rachith, et al.
Published: (2023)

NExT-GPT: Any-to-Any Multimodal LLM
by: Wu, Shengqiong, et al.
Published: (2023)

Universal Neurons in GPT2 Language Models
by: Gurnee, Wes, et al.
Published: (2024)

HumanEval on Latest GPT Models -- 2024
by: Li, Daniel, et al.
Published: (2024)

Fairness of ChatGPT
by: Li, Yunqi, et al.
Published: (2023)

Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data
by: Leng, Yu, et al.
Published: (2025)

Mind the Gap: A Review of Arabic Post-Training Datasets and Their Limitations
by: Alkhowaiter, Mohammed, et al.
Published: (2025)

Mechanistic Interpretability of GPT-like Models on Summarization Tasks
by: Mishra, Anurag
Published: (2025)

If in a Crowdsourced Data Annotation Pipeline, a GPT-4
by: He, Zeyu, et al.
Published: (2024)

Meta-GPT: Decoding the Metasurface Genome with Generative Artificial Intelligence
by: Dang, David, et al.
Published: (2025)

How Prevalent is Gender Bias in ChatGPT? -- Exploring German and English ChatGPT Responses
by: Urchs, Stefanie, et al.
Published: (2023)

MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters
by: Dada, Amin, et al.
Published: (2025)

Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
by: Zhang, Jiazheng, et al.
Published: (2025)

PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners
by: Xiao, Yijia, et al.
Published: (2023)

Enhancing Antibiotic Stewardship using a Natural Language Approach for Better Feature Representation
by: Lee, Simon A., et al.
Published: (2024)