| Main Authors: | Fu, Tairan, Martínez, Gonzalo, Conde, Javier, Arriaga, Carlos, Reviriego, Pedro, Qi, Xiuyuan, Liu, Shanshan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.06118 |
Similar Items
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident, Especially When They are Wrong
by: Fu, Tairan, et al.
Published: (2025)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations
by: Arriaga, Carlos, et al.
Published: (2025)
Is There a Case for Conversation Optimized Tokenizers in Large Language Models?
by: Ferrando, Raquel, et al.
Published: (2025)
Lost in Sampling: Assessing Lexical Reachability in LLMs via the Word Coverage Score (WCS)
by: Awad, Samer, et al.
Published: (2026)
Speed and Conversational Large Language Models: Not All Is About Tokens per Second
by: Conde, Javier, et al.
Published: (2025)
Why Do Large Language Models (LLMs) Struggle to Count Letters?
by: Fu, Tairan, et al.
Published: (2024)
Can ChatGPT Learn to Count Letters?
by: Conde, Javier, et al.
Published: (2025)
Stochastic Streets: A Walk Through Random LLM Address Generation in four European Cities
by: Fu, Tairan, et al.
Published: (2025)
To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times
by: Clark, Thomas Hikaru, et al.
Published: (2026)
Concurrent Linguistic Error Detection (CLED): a New Methodology for Error Detection in Large Language Models
by: Zhu, Jinhua, et al.
Published: (2024)
Playing with words: Comparing the vocabulary and lexical diversity of ChatGPT and humans
by: Reviriego, Pedro, et al.
Published: (2023)
Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans
by: Conde, Javier, et al.
Published: (2025)
Understanding the Impact of Artificial Intelligence in Academic Writing: Metadata to the Rescue
by: Conde, Javier, et al.
Published: (2025)
Have Multimodal Large Language Models (MLLMs) Really Learned to Tell the Time on Analog Clocks?
by: Fu, Tairan, et al.
Published: (2025)
How does fine-tuning improve sensorimotor representations in large language models?
by: Wu, Minghua, et al.
Published: (2026)
Energy-Efficient Stochastic Computing (SC) Neural Networks for Internet of Things Devices With Layer-Wise Adjustable Sequence Length (ASL)
by: Wang, Ziheng, et al.
Published: (2025)
Large Language Models and Book Summarization: Reading or Remembering, Which Is Better?
by: Fu, Tairan, et al.
Published: (2026)
Lost in the Vibrations: Vision Language Models Fail the Dynamic Gauges Test
by: Fu, Tairan, et al.
Published: (2026)
Real-time Spatial Retrieval Augmented Generation for Urban Environments
by: Campo, David Nazareno, et al.
Published: (2025)
Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings
by: Conde, Javier, et al.
Published: (2025)
Assessing Latency in ASR Systems: A Methodological Perspective for Real-Time Use
by: Arriaga, Carlos, et al.
Published: (2024)
Spanish and LLM Benchmarks: is MMLU Lost in Translation?
by: Plaza, Irene, et al.
Published: (2024)
DiFR: Inference Verification Despite Nondeterminism
by: Karvonen, Adam, et al.
Published: (2025)
Artificial Intelligence and Misinformation in Art: Can Vision Language Models Judge the Hand or the Machine Behind the Canvas?
by: Fu, Tairan, et al.
Published: (2025)
Optimistic Verifiable Training by Controlling Hardware Nondeterminism
by: Srivastava, Megha, et al.
Published: (2024)
Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
by: Jiang, Peihai, et al.
Published: (2025)
Evaluating SAT and SMT Solvers on Large-Scale Sudoku Puzzles
by: Davis, Liam, et al.
Published: (2025)
Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach
by: Quevedo, Ernesto, et al.
Published: (2024)
Analyzing Recursiveness in Multimodal Generative Artificial Intelligence: Stability or Divergence?
by: Conde, Javier, et al.
Published: (2024)
BrainBench: Exposing the Commonsense Reasoning Gap in Large Language Models
by: Tang, Yuzhe
Published: (2026)
Distributed Specialization: Rare-Token Neurons in Large Language Models
by: Liu, Jing, et al.
Published: (2025)
Calibrating Verbalized Probabilities for Large Language Models
by: Wang, Cheng, et al.
Published: (2024)
Incoherent Probability Judgments in Large Language Models
by: Zhu, Jian-Qiao, et al.
Published: (2024)
findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding
by: Martínez, Héctor Javier Vázquez
Published: (2026)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
Reproducibility Study of "XRec: Large Language Models for Explainable Recommendation"
by: Mishra, Ranjan, et al.
Published: (2025)
Establishing Vocabulary Tests as a Benchmark for Evaluating Large Language Models
by: Martínez, Gonzalo, et al.
Published: (2023)
Reproducing and Extending Experiments in Behavioral Strategy with Large Language Models
by: Albert, Daniel, et al.
Published: (2024)
Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
by: Liu, Peijie, et al.
Published: (2025)
HInter: Exposing Hidden Intersectional Bias in Large Language Models
by: Souani, Badr, et al.
Published: (2025)