| Main Authors: | Leung, Haun, Wang, ZiNan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.13475 |
Similar Items
Unified Pix Token And Word Token Generative Language Model
by: Leung, Haun, et al.
Published: (2026)
Scaling up the think-aloud method
by: Wurgaft, Daniel, et al.
Published: (2025)
Do not think about pink elephant!
by: Hwang, Kyomin, et al.
Published: (2024)
Position: The Most Expensive Part of an LLM should be its Training Data
by: Kandpal, Nikhil, et al.
Published: (2025)
People will agree what I think: Investigating LLM's False Consensus Effect
by: Choi, Junhyuk, et al.
Published: (2024)
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
by: Pletenev, Sergey, et al.
Published: (2025)
Which symbol grounding problem should we try to solve?
by: Müller, Vincent C.
Published: (2025)
SurveyEval: Towards Comprehensive Evaluation of LLM-Generated Academic Surveys
by: Zhao, Jiahao, et al.
Published: (2025)
Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems
by: Min, Yingqian, et al.
Published: (2024)
Referential ambiguity and clarification requests: comparing human and LLM behaviour
by: Madge, Chris, et al.
Published: (2025)
How Reliable are LLMs as Knowledge Bases? Re-thinking Facutality and Consistency
by: Zheng, Danna, et al.
Published: (2024)
Alleviating Choice Supportive Bias in LLM with Reasoning Dependency Generation
by: Zhuang, Nan, et al.
Published: (2025)
Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
by: Choi, Junhyuk, et al.
Published: (2025)
LLM The Genius Paradox: A Linguistic and Math Expert's Struggle with Simple Word-based Counting Problems
by: Xu, Nan, et al.
Published: (2024)
DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS
by: Wang, Yingxu, et al.
Published: (2025)
Giving AI a voice: how does AI think it should be treated?
by: Fay, Maria, et al.
Published: (2025)
Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?
by: Nan, Yang, et al.
Published: (2025)
Two Pathways to Truthfulness: On the Intrinsic Encoding of LLM Hallucinations
by: Luo, Wen, et al.
Published: (2026)
StyleBench: Evaluating thinking styles in Large Language Models
by: Guo, Junyu, et al.
Published: (2025)
Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression
by: Peng, Jingyu, et al.
Published: (2025)
Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey
by: Li, Haochen, et al.
Published: (2024)
Stylometry recognizes human and LLM-generated texts in short samples
by: Przystalski, Karol, et al.
Published: (2025)
System 2 thinking in OpenAI's o1-preview model: Near-perfect performance on a mathematics exam
by: de Winter, Joost, et al.
Published: (2024)
GameArena: Evaluating LLM Reasoning through Live Computer Games
by: Hu, Lanxiang, et al.
Published: (2024)
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples
by: Yao, Jia-Yu, et al.
Published: (2023)
Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm
by: Hou, Nan, et al.
Published: (2026)
EvoP: Robust LLM Inference via Evolutionary Pruning
by: Wu, Shangyu, et al.
Published: (2025)
Are they human? Detecting large language models by probing human memory constraints
by: Schug, Simon, et al.
Published: (2026)
LLM-Rec: Personalized Recommendation via Prompting Large Language Models
by: Lyu, Hanjia, et al.
Published: (2023)
Competition-Level Problems are Effective LLM Evaluators
by: Huang, Yiming, et al.
Published: (2023)
Increasing faithfulness in human-human dialog summarization with Spoken Language Understanding tasks
by: Akani, Eunice, et al.
Published: (2024)
Federated In-Context LLM Agent Learning
by: Wu, Panlong, et al.
Published: (2024)
Micro-Act: Mitigating Knowledge Conflict in LLM-based RAG via Actionable Self-Reasoning
by: Huo, Nan, et al.
Published: (2025)
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
by: Wan, Ziyu, et al.
Published: (2025)
The Integration of Semantic and Structural Knowledge in Knowledge Graph Entity Typing
by: Li, Muzhi, et al.
Published: (2024)
When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models
by: Amouyal, Samuel Joseph, et al.
Published: (2025)
LLM-based MOFs Synthesis Condition Extraction using Few-Shot Demonstrations
by: Shi, Lei, et al.
Published: (2024)
Way to Specialist: Closing Loop Between Specialized LLM and Evolving Domain Knowledge Graph
by: Zhang, Yutong, et al.
Published: (2024)
LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection
by: Xu, Cheng, et al.
Published: (2026)
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
by: Gor, Maharshi, et al.
Published: (2024)