Gespeichert in:
| Hauptverfasser: | , , |
|---|---|
| Format: | Preprint |
| Veröffentlicht: |
2026
|
| Schlagworte: | |
| Online-Zugang: | https://arxiv.org/abs/2605.04857 |
| Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
| _version_ | 1866915983970533376 |
|---|---|
| author | Santos, Eduardo Carvalho, Juliana Rennó-Costa, César |
| author_facet | Santos, Eduardo Carvalho, Juliana Rennó-Costa, César |
| contents | This paper presents the development and validation of an eye-tracking dataset designed to investigate how second-language (L2) learners process idiomatic expressions. While native speakers often rely on direct retrieval of figurative meanings, L2 speakers frequently adopt a literal-first approach, which incurs measurable cognitive costs. This resource captures these costs through ocular metrics recorded from Portuguese L1 speakers of English across all CEFR proficiency levels (A1-C2). Although the study uses entry-level 60 Hz hardware (Tobii Pro Spark), we demonstrate that this sampling rate provides sufficient data density to detect macro-cognitive events such as fixations and regressions in reading. Preliminary analysis validates the dataset by revealing a strong inverse correlation between language proficiency and regressive eye movements. Integrated into the MIA (Modeling Idiomaticity in Human and Artificial Language Processing) initiative, this dataset serves as a cognitively grounded benchmark for evaluating both human processing models and the alignment of large language models with human-like figurative understanding. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2605_04857 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| spellingShingle | Assessing Cognitive Effort in L2 Idiomatic Processing: An Eye-Tracking Dataset Santos, Eduardo Carvalho, Juliana Rennó-Costa, César Computation and Language Artificial Intelligence Computer Vision and Pattern Recognition This paper presents the development and validation of an eye-tracking dataset designed to investigate how second-language (L2) learners process idiomatic expressions. While native speakers often rely on direct retrieval of figurative meanings, L2 speakers frequently adopt a literal-first approach, which incurs measurable cognitive costs. This resource captures these costs through ocular metrics recorded from Portuguese L1 speakers of English across all CEFR proficiency levels (A1-C2). Although the study uses entry-level 60 Hz hardware (Tobii Pro Spark), we demonstrate that this sampling rate provides sufficient data density to detect macro-cognitive events such as fixations and regressions in reading. Preliminary analysis validates the dataset by revealing a strong inverse correlation between language proficiency and regressive eye movements. Integrated into the MIA (Modeling Idiomaticity in Human and Artificial Language Processing) initiative, this dataset serves as a cognitively grounded benchmark for evaluating both human processing models and the alignment of large language models with human-like figurative understanding. |
| title | Assessing Cognitive Effort in L2 Idiomatic Processing: An Eye-Tracking Dataset |
| topic | Computation and Language Artificial Intelligence Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2605.04857 |