Bibliographic Details
Main Authors: Wang, Yuchen; Wang, Haonan; Guo, Yu; Yang, Honglong; Li, Xiaomeng
Format: Preprint
Published: 2026
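Subjects: Computation and Language; Artificial Intelligence; Human-Computer Interaction; Audio and Speech Processing; Neurons and Cognition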
Online Access: https://arxiv.org/abs/2603.03312
author Wang, Yuchen
Wang, Haonan
Guo, Yu
Yang, Honglong
Li, Xiaomeng
contents Decoding natural language from non-invasive EEG signals is a promising yet challenging task. However, current state-of-the-art models remain constrained by three fundamental issues: Semantic Bias, where outputs collapse into generic linguistic templates; Signal Neglect, where models rely heavily on LLM priors to hallucinate fluent text even in the absence of meaningful signals; and the "BLEU Trap", where high-frequency stopwords inflate n-gram metrics, masking a lack of true semantic fidelity. To resolve these challenges, we move beyond conventional end-to-end pipelines and propose SemKey, a novel multi-stage framework that enforces signal-grounded generation through four decoupled semantic objectives: sentiment, topic, length, and surprisal. We extract these semantic anchors from EEG embeddings directly, then unify them with an Active Retrieval Decoding mechanism, compelling the LLM to ground its token generation in the neural signals rather than defaulting to linguistic priors. Furthermore, we break the BLEU Trap by establishing a comprehensive evaluation protocol using rigorous retrieval and distribution-based metrics such as Fréchet Distance. Extensive experiments demonstrate that SemKey effectively mitigates hallucinations on noise inputs and achieves SOTA performance on these robust protocols. Code will be released upon acceptance at https://github.com/xmed-lab/SemKey.
format Preprint
id arxiv_https___arxiv_org_abs_2603_03312
institution arXiv
publishDate 2026
record_format arxiv
title Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding
topic Computation and Language
Artificial Intelligence
Human-Computer Interaction
Audio and Speech Processing
Neurons and Cognition
url https://arxiv.org/abs/2603.03312