| Main Authors: | Wang, Yuchen; Wang, Haonan; Guo, Yu; Yang, Honglong; Li, Xiaomeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
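| Subjects: | Computation and Language; Artificial Intelligence; Human-Computer Interaction; Audio and Speech Processing; Neurons and Cognition |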
| Online Access: | https://arxiv.org/abs/2603.03312 |
| _version_ | 1866913174594256896 |
|---|---|
| author | Wang, Yuchen; Wang, Haonan; Guo, Yu; Yang, Honglong; Li, Xiaomeng |
| author_facet | Wang, Yuchen; Wang, Haonan; Guo, Yu; Yang, Honglong; Li, Xiaomeng |
| contents | Decoding natural language from non-invasive EEG signals is a promising yet challenging task. However, current state-of-the-art models remain constrained by three fundamental issues: Semantic Bias, where outputs collapse into generic linguistic templates; Signal Neglect, where models rely heavily on LLM priors to hallucinate fluent text even in the absence of meaningful signals; and the "BLEU Trap", where high-frequency stopwords inflate n-gram metrics, masking a lack of true semantic fidelity. To resolve these challenges, we move beyond conventional end-to-end pipelines and propose SemKey, a novel multi-stage framework that enforces signal-grounded generation through four decoupled semantic objectives: sentiment, topic, length, and surprisal. We extract these semantic anchors from EEG embeddings directly, then unify them with an Active Retrieval Decoding mechanism, compelling the LLM to ground its token generation in the neural signals rather than defaulting to linguistic priors. Furthermore, we break the BLEU Trap by establishing a comprehensive evaluation protocol using rigorous retrieval and distribution-based metrics such as Fréchet Distance. Extensive experiments demonstrate that SemKey effectively mitigates hallucinations on noise inputs and achieves SOTA performance on these robust protocols. Code will be released upon acceptance at https://github.com/xmed-lab/SemKey. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_03312 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding |
| topic | Computation and Language; Artificial Intelligence; Human-Computer Interaction; Audio and Speech Processing; Neurons and Cognition |
| url | https://arxiv.org/abs/2603.03312 |
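
The "BLEU Trap" named in the abstract (high-frequency stopwords inflating n-gram metrics) can be illustrated with a small sketch. This is not the authors' evaluation protocol: the stopword list, the two sentences, and the helper `unigram_precision` are all assumptions made for illustration, and the function computes only clipped unigram precision, the 1-gram building block of BLEU, not a full BLEU score.

```python
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "is", "was", "and", "in", "to", "it"}


def unigram_precision(hypothesis, reference, drop_stopwords=False):
    """Clipped unigram precision between a hypothesis and one reference."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if drop_stopwords:
        hyp = [t for t in hyp if t not in STOPWORDS]
        ref = [t for t in ref if t not in STOPWORDS]
    if not hyp:
        return 0.0
    ref_counts = Counter(ref)
    hyp_counts = Counter(hyp)
    # Each hypothesis token is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hyp)


# Hypothetical sentences: a generic template that shares only stopwords
# with the reference.
reference = "the film is a moving portrait of the artist"
generic = "the movie is a story of the man"

with_stop = unigram_precision(generic, reference)
without_stop = unigram_precision(generic, reference, drop_stopwords=True)

print(f"with stopwords:    {with_stop}")     # 0.625
print(f"without stopwords: {without_stop}")  # 0.0
```

In this toy case the generic output scores 0.625 on unigram overlap even though it shares no content word with the reference. That is the kind of inflation the abstract says retrieval and distribution-based metrics are meant to expose.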