Staff View: CRAiG: Contextual Retrieval Augmented Generation :: Library Catalog

Bibliographic Details
Main Author:	Adeola, Maximus
Format:	Digital resource
Language:
Published:	Zenodo 2026
Online Access:	https://doi.org/10.5281/zenodo.18502053

_version_	1866901345528709120
author	Adeola, Maximus
author_facet	Adeola, Maximus
contents	<p>Conversational systems built on Large Language Models (LLMs) face an escalating challenge: as dialogue history grows, context windows expand exponentially, drastically increasing inference costs with each new message. Compounding this issue is the quadratic complexity of self-attention mechanisms (O(N^2)), which limits the practical context capacity of even state-of-the-art models. I present CRAiG (Contextual Retrieval Augmented Generation), a novel architecture in which a lightweight External Attention Mechanism (EAM), a 43 million parameter model, is trained to operate atop any generative LLM, intelligently curating the most relevant context for each prompt. By decoupling context selection from generation, CRAiG enables models to handle large conversational histories (up to 3.6 million tokens) while processing only a constant, manageable subset of information at inference time. Through a three-stage training process incorporating teacher-supervised learning, Semantic Phase Shift Augmentation (SPSA), and Natural Language Inference (NLI) optimization, CRAiG achieved a 68.53% accuracy on LongBench v2, surpassing state-of-the-art commercial models including Gemini 3 Pro (65.6%) and Claude Sonnet 4.5 (61.8%), while reducing token consumption by up to 93%. My approach demonstrates exceptional performance on domain-specific tasks, reaching 79.59% accuracy on code repository understanding and 75.31% on long in-context learning. The entire research project, from data collection to final training, cost under $19 USD, demonstrating the cost-effectiveness and accessibility of this method.</p>
format	Digital resource
id	zenodo_https___doi_org_10_5281_zenodo_18502053
institution	Zenodo
language
publishDate	2026
publisher	Zenodo
record_format	zenodo
spellingShingle	CRAiG: Contextual Retrieval Augmented Generation Adeola, Maximus <p>Conversational systems built on Large Language Models (LLMs) face an escalating challenge: as dialogue history grows, context windows expand exponentially, drastically increasing inference costs with each new message. Compounding this issue is the quadratic complexity of self-attention mechanisms (O(N^2)), which limits the practical context capacity of even state-of-the-art models. I present CRAiG (Contextual Retrieval Augmented Generation), a novel architecture in which a lightweight External Attention Mechanism (EAM), a 43 million parameter model, is trained to operate atop any generative LLM, intelligently curating the most relevant context for each prompt. By decoupling context selection from generation, CRAiG enables models to handle large conversational histories (up to 3.6 million tokens) while processing only a constant, manageable subset of information at inference time. Through a three-stage training process incorporating teacher-supervised learning, Semantic Phase Shift Augmentation (SPSA), and Natural Language Inference (NLI) optimization, CRAiG achieved a 68.53% accuracy on LongBench v2, surpassing state-of-the-art commercial models including Gemini 3 Pro (65.6%) and Claude Sonnet 4.5 (61.8%), while reducing token consumption by up to 93%. My approach demonstrates exceptional performance on domain-specific tasks, reaching 79.59% accuracy on code repository understanding and 75.31% on long in-context learning. The entire research project, from data collection to final training, cost under $19 USD, demonstrating the cost-effectiveness and accessibility of this method.</p>
title	CRAiG: Contextual Retrieval Augmented Generation
url	https://doi.org/10.5281/zenodo.18502053

Similar Items