Salvato in:
| Autori principali: | , , |
|---|---|
| Natura: | Preprint |
| Pubblicazione: |
2024
|
| Soggetti: | |
| Accesso online: | https://arxiv.org/abs/2410.03754 |
| Tags: |
Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
|
| _version_ | 1866909338174488576 |
|---|---|
| author | Shah, Keyush Goyal, Abhishek Wasserman, Isaac |
| author_facet | Shah, Keyush Goyal, Abhishek Wasserman, Isaac |
| contents | Retrieval augmented generation (RAG) has become the standard in long context question answering (QA) systems. However, typical implementations of RAG rely on a rather naive retrieval mechanism, in which texts whose embeddings are most similar to that of the query are deemed most relevant. This has consequences in subjective QA tasks, where the most relevant text may not directly contain the answer. In this work, we propose a novel extension to RAG systems, which we call Retrieval from AI Derived Documents (RAIDD). RAIDD leverages the full power of the LLM in the retrieval process by deriving inferred features, such as summaries and example questions, from the documents at ingest. We demonstrate that this approach significantly improves the performance of RAG systems on long-context QA tasks. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2410_03754 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | Enhancing Retrieval in QA Systems with Derived Feature Association Shah, Keyush Goyal, Abhishek Wasserman, Isaac Computation and Language Information Retrieval Retrieval augmented generation (RAG) has become the standard in long context question answering (QA) systems. However, typical implementations of RAG rely on a rather naive retrieval mechanism, in which texts whose embeddings are most similar to that of the query are deemed most relevant. This has consequences in subjective QA tasks, where the most relevant text may not directly contain the answer. In this work, we propose a novel extension to RAG systems, which we call Retrieval from AI Derived Documents (RAIDD). RAIDD leverages the full power of the LLM in the retrieval process by deriving inferred features, such as summaries and example questions, from the documents at ingest. We demonstrate that this approach significantly improves the performance of RAG systems on long-context QA tasks. |
| title | Enhancing Retrieval in QA Systems with Derived Feature Association |
| topic | Computation and Language Information Retrieval |
| url | https://arxiv.org/abs/2410.03754 |