| Main Authors: | Yuan, Yifei, Søgaard, Anders |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.04421 |
Similar Items
Evaluation Revisited: A Taxonomy of Evaluation Concerns in Natural Language Processing
by: Dhar, Ruchira, et al.
Published: (2026)
Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering
by: Yuan, Yifei, et al.
Published: (2024)
What if Othello-Playing Language Models Could See?
by: Chen, Xinyi, et al.
Published: (2025)
From Words to Worlds: Compositionality for Cognitive Architectures
by: Dhar, Ruchira, et al.
Published: (2024)
Factual Consistency of Multilingual Pretrained Language Models
by: Fierro, Constanza, et al.
Published: (2022)
Word Order and World Knowledge
by: Zhao, Qinghua, et al.
Published: (2024)
Concept Space Alignment in Multilingual LLMs
by: Peng, Qiwei, et al.
Published: (2024)
Understanding Subword Compositionality of Large Language Models
by: Peng, Qiwei, et al.
Published: (2025)
Does Instruction Tuning Make LLMs More Consistent?
by: Fierro, Constanza, et al.
Published: (2024)
How Do Multilingual Language Models Remember Facts?
by: Fierro, Constanza, et al.
Published: (2024)
Evaluating Adjective-Noun Compositionality in LLMs: Functional vs Representational Perspectives
by: Dhar, Ruchira, et al.
Published: (2026)
A Discriminative Latent-Variable Model for Bilingual Lexicon Induction
by: Ruder, Sebastian, et al.
Published: (2018)
Trick or Neat: Adversarial Ambiguity and Language Model Evaluation
by: Karamolegkou, Antonia, et al.
Published: (2025)
Defining Knowledge: Bridging Epistemology and Large Language Models
by: Fierro, Constanza, et al.
Published: (2024)
MuLan: A Study of Fact Mutability in Language Models
by: Fierro, Constanza, et al.
Published: (2024)
Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach
by: Sun, Kun, et al.
Published: (2024)
WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset
by: Ribeiro, Tiago, et al.
Published: (2023)
Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
by: Li, Jiaang, et al.
Published: (2023)
Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations
by: Brandl, Stephanie, et al.
Published: (2024)
Revisiting the Superficial Alignment Hypothesis
by: Raghavendra, Mohit, et al.
Published: (2024)
Lost in Embeddings: Information Loss in Vision-Language Models
by: Li, Wenyan, et al.
Published: (2025)
Debiasing Multilingual LLMs in Cross-lingual Latent Space
by: Peng, Qiwei, et al.
Published: (2025)
mOthello: When Do Cross-Lingual Representation Alignment and Cross-Lingual Transfer Emerge in Multilingual Models?
by: Hua, Tianze, et al.
Published: (2024)
Revisiting the UID Hypothesis in LLM Reasoning Traces
by: Gwak, Minju, et al.
Published: (2025)
MEG: Medical Knowledge-Augmented Large Language Models for Question Answering
by: Cabello, Laura, et al.
Published: (2024)
Vision-Language Models under Cultural and Inclusive Considerations
by: Karamolegkou, Antonia, et al.
Published: (2024)
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture
by: Li, Wenyan, et al.
Published: (2024)
RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding
by: Li, Jiaang, et al.
Published: (2025)
Revisiting the Uniform Information Density Hypothesis in LLM Reasoning
by: Gwak, Minju, et al.
Published: (2025)
Towards Fair ASR For Second Language Speakers Using Fairness Prompted Finetuning
by: Swain, Monorama, et al.
Published: (2025)
Ethical Concern Identification in NLP: A Corpus of ACL Anthology Ethics Statements
by: Karamolegkou, Antonia, et al.
Published: (2024)
LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation
by: Go, Gregory Hok Tjoan, et al.
Published: (2025)
The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
by: Zhou, Yunhua, et al.
Published: (2022)
Lost at the Beginning of Reasoning
by: Liao, Baohao, et al.
Published: (2025)
EvalCards: A Framework for Standardized Evaluation Reporting
by: Dhar, Ruchira, et al.
Published: (2025)
Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users
by: Karamolegkou, Antonia, et al.
Published: (2025)
The Cylindrical Representation Hypothesis for Language Model Steering
by: Gao, Lang, et al.
Published: (2026)
Revisiting the Graph Reasoning Ability of Large Language Models: Case Studies in Translation, Connectivity and Shortest Path
by: Dai, Xinnan, et al.
Published: (2024)
Statistical Hypothesis Testing for Auditing Robustness in Language Models
by: Rauba, Paulius, et al.
Published: (2025)
Mechanistic Interpretability Needs Philosophy
by: Williams, Iwan, et al.
Published: (2025)