:: Library Catalog

Bibliographic Details
Main Authors:	Fu, Tairan; Ferrando, Raquel; Conde, Javier; Arriaga, Carlos; Reviriego, Pedro
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2412.18626

Similar Items

Is There a Case for Conversation Optimized Tokenizers in Large Language Models?
by: Ferrando, Raquel, et al.
Published: (2025)

Lost in Sampling: Assessing Lexical Reachability in LLMs via the Word Coverage Score (WCS)
by: Awad, Samer, et al.
Published: (2026)

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident, Especially When They are Wrong
by: Fu, Tairan, et al.
Published: (2025)

Have Multimodal Large Language Models (MLLMs) Really Learned to Tell the Time on Analog Clocks?
by: Fu, Tairan, et al.
Published: (2025)

Large Language Models and Book Summarization: Reading or Remembering, Which Is Better?
by: Fu, Tairan, et al.
Published: (2026)

The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations
by: Arriaga, Carlos, et al.
Published: (2025)

To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times
by: Clark, Thomas Hikaru, et al.
Published: (2026)

Can ChatGPT Learn to Count Letters?
by: Conde, Javier, et al.
Published: (2025)

Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings
by: Conde, Javier, et al.
Published: (2025)

Open Conversational LLMs do not know most Spanish words
by: Conde, Javier, et al.
Published: (2024)

Beyond Reproducibility: Token Probabilities Expose Large Language Model Nondeterminism
by: Fu, Tairan, et al.
Published: (2026)

Lost in the Vibrations: Vision Language Models Fail the Dynamic Gauges Test
by: Fu, Tairan, et al.
Published: (2026)

Speed and Conversational Large Language Models: Not All Is About Tokens per Second
by: Conde, Javier, et al.
Published: (2025)

Evaluating Large Language Models with Tests of Spanish as a Foreign Language: Pass or Fail?
by: Mayor-Rocher, Marina, et al.
Published: (2024)

Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study
by: Martínez, Gonzalo, et al.
Published: (2024)

Concurrent Linguistic Error Detection (CLED): a New Methodology for Error Detection in Large Language Models
by: Zhu, Jinhua, et al.
Published: (2024)

How does fine-tuning improve sensorimotor representations in large language models?
by: Wu, Minghua, et al.
Published: (2026)

Establishing Vocabulary Tests as a Benchmark for Evaluating Large Language Models
by: Martínez, Gonzalo, et al.
Published: (2023)

Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans
by: Conde, Javier, et al.
Published: (2025)

It's the same but not the same: Do LLMs distinguish Spanish varieties?
by: Mayor-Rocher, Marina, et al.
Published: (2025)

Using large language models to estimate features of multi-word expressions: Concreteness, valence, arousal
by: Martínez, Gonzalo, et al.
Published: (2024)

Playing with words: Comparing the vocabulary and lexical diversity of ChatGPT and humans
by: Reviriego, Pedro, et al.
Published: (2023)

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
by: Ferrando, Javier, et al.
Published: (2024)

Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions
by: Sobotka, Jan, et al.
Published: (2026)

Information Flow Routes: Automatically Interpreting Language Models at Scale
by: Ferrando, Javier, et al.
Published: (2024)

Assessing Latency in ASR Systems: A Methodological Perspective for Real-Time Use
by: Arriaga, Carlos, et al.
Published: (2024)

Automated Interpretability and Feature Discovery in Language Models with Agents
by: Marin-Llobet, Arnau, et al.
Published: (2026)

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
by: Zhu, Yuqi, et al.
Published: (2025)

Spanish and LLM Benchmarks: is MMLU Lost in Translation?
by: Plaza, Irene, et al.
Published: (2024)

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?
by: Li, Pengxiang, et al.
Published: (2026)

Large Language Models Struggle with Unreasonability in Math Problems
by: Ma, Jingyuan, et al.
Published: (2024)

Why Do Self-Harm Prediction Models Struggle to Generalise? Lexical and Semantic Variations in Emergency Department Triage Notes
by: Chen, Liuliu, et al.
Published: (2026)

LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
by: Tufanov, Igor, et al.
Published: (2024)

Stochastic Streets: A Walk Through Random LLM Address Generation in four European Cities
by: Fu, Tairan, et al.
Published: (2025)

On the Similarity of Circuits across Languages: a Case Study on the Subject-verb Agreement Task
by: Ferrando, Javier, et al.
Published: (2024)

Artificial Intelligence and Misinformation in Art: Can Vision Language Models Judge the Hand or the Machine Behind the Canvas?
by: Fu, Tairan, et al.
Published: (2025)

Gender Inequality in English Textbooks Around the World: an NLP Approach
by: Liu, Tairan
Published: (2025)

A Primer on the Inner Workings of Transformer-based Language Models
by: Ferrando, Javier, et al.
Published: (2024)

Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding
by: Huang, Kung-Hsiang, et al.
Published: (2025)

The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition
by: Tan, Yuwen, et al.
Published: (2025)