| Main Authors: | Lugoloobi, William, Russell, Chris |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.18147 |
Similar Items
LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations
by: Lugoloobi, William, et al.
Published: (2026)
QueST: Incentivizing LLMs to Generate Difficult Problems
by: Hu, Hanxu, et al.
Published: (2025)
Known By Their Actions: Fingerprinting LLM Browser Agents via UI Traces
by: Lugoloobi, William, et al.
Published: (2026)
Multi-use LLM Watermarking and the False Detection Problem
by: Fu, Zihao, et al.
Published: (2025)
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs
by: Zhao, Sihang, et al.
Published: (2024)
ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning
by: Pei, Qizhi, et al.
Published: (2025)
SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers
by: Manem, Chaitanya, et al.
Published: (2025)
Hidden in the Haystack: Smaller Needles are More Difficult for LLMs to Find
by: Bianchi, Owen, et al.
Published: (2025)
Generating Difficult-to-Translate Texts
by: Zouhar, Vilém, et al.
Published: (2025)
Do LLMs and Humans Find the Same Questions Difficult? A Case Study on Japanese Quiz Answering
by: Sugiura, Naoya, et al.
Published: (2025)
LLMs Encode Harmfulness and Refusal Separately
by: Zhao, Jiachen, et al.
Published: (2025)
Difficult for Whom? A Study of Japanese Lexical Complexity
by: Nohejl, Adam, et al.
Published: (2024)
How to Encode Domain Information in Relation Classification
by: Bassignana, Elisa, et al.
Published: (2024)
Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection
by: Thorat, Shantanu, et al.
Published: (2024)
Task-Specific Knowledge Distillation via Intermediate Probes
by: Brown, Ryan, et al.
Published: (2026)
Long Is More Important Than Difficult for Training Reasoning Models
by: Shen, Si, et al.
Published: (2025)
How Do LLMs Perform Two-Hop Reasoning in Context?
by: Guo, Tianyu, et al.
Published: (2025)
An Encoding for CLP Problems in SMT-LIB
by: Amrollahi, Daneshvar, et al.
Published: (2024)
It's Difficult to be Neutral -- Human and LLM-based Sentiment Annotation of Patient Comments
by: Mæhlum, Petter, et al.
Published: (2024)
Learning-Time Encoding Shapes Unlearning in LLMs
by: Wu, Ruihan, et al.
Published: (2025)
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning
by: Schwartz, Eli, et al.
Published: (2024)
Sakura at BEA 2026 Shared Task 1: What Makes Vocabulary Difficult?
by: Nohejl, Adam, et al.
Published: (2026)
Do LLMs Encode Frame Semantics? Evidence from Frame Identification
by: Chundru, Jayanth Krishna, et al.
Published: (2025)
Entropy-Driven Pre-Tokenization for Byte-Pair Encoding
by: Hu, Yifan, et al.
Published: (2025)
Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?
by: Hase, Peter, et al.
Published: (2024)
How Do LLMs Encode Scientific Quality? An Empirical Study Using Monosemantic Features from Sparse Autoencoders
by: McCoubrey, Michael, et al.
Published: (2026)
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge
by: Fu, Jinlan, et al.
Published: (2024)
From Early Encoding to Late Suppression: Interpreting LLMs on Character Counting Tasks
by: Datta, Ayan, et al.
Published: (2026)
Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons
by: Zhou, Shijia, et al.
Published: (2024)
Are You Doubtful? Oh, It Might Be Difficult Then! Exploring the Use of Model Uncertainty for Question Difficulty Estimation
by: Zotos, Leonidas, et al.
Published: (2024)
Circuit Fingerprints: How Answer Tokens Encode Their Geometrical Path
by: Saurez, Andres, et al.
Published: (2026)
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
by: Gao, Chengqian, et al.
Published: (2025)
CoDiQ: Test-Time Scaling for Controllable Difficult Question Generation
by: Peng, Zhongyuan, et al.
Published: (2026)
Do LLMs Encode Functional Importance of Reasoning Tokens?
by: Singh, Janvijay, et al.
Published: (2026)
Harmful Prompt Laundering: Jailbreaking LLMs with Abductive Styles and Symbolic Encoding
by: Joo, Seongho, et al.
Published: (2025)
Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models
by: Lorge, Isabelle, et al.
Published: (2024)
Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy
by: Zeng, Min, et al.
Published: (2024)
Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs
by: Yan, Lecheng, et al.
Published: (2026)
Outlier Dimensions Encode Task-Specific Knowledge
by: Rudman, William, et al.
Published: (2023)
How susceptible are LLMs to Logical Fallacies?
by: Payandeh, Amirreza, et al.
Published: (2023)