Staff View: Optimizing Multi-Hop Document Retrieval Through Intermediate Representations :: Library Catalog

Bibliographic Details
Main Authors:	Lin, Jiaen; Liu, Jingyu; Liu, Yingbo
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Information Retrieval
Online Access:	https://arxiv.org/abs/2503.04796

_version_	1866917231613444096
author	Lin, Jiaen; Liu, Jingyu; Liu, Yingbo
author_facet	Lin, Jiaen; Liu, Jingyu; Liu, Yingbo
contents	Retrieval-augmented generation (RAG) encounters challenges when addressing complex queries, particularly multi-hop questions. While several methods tackle multi-hop queries by iteratively generating internal queries and retrieving external documents, these approaches are computationally expensive. In this paper, we identify a three-stage information processing pattern in LLMs during layer-by-layer reasoning, consisting of extraction, processing, and subsequent extraction steps. This observation suggests that the representations in intermediate layers contain richer information than those in other layers. Building on this insight, we propose Layer-wise RAG (L-RAG). Unlike prior methods that focus on generating new internal queries, L-RAG leverages intermediate representations from the middle layers, which capture next-hop information, to retrieve external knowledge. L-RAG achieves performance comparable to multi-step approaches while maintaining inference overhead similar to that of standard RAG. Experimental results show that L-RAG outperforms existing RAG methods on open-domain multi-hop question-answering datasets, including MuSiQue, HotpotQA, and 2WikiMultiHopQA. The code is available at https://github.com/Olive-2019/L-RAG
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_04796
institution	arXiv
publishDate	2025
record_format	arxiv
title	Optimizing Multi-Hop Document Retrieval Through Intermediate Representations
topic	Computation and Language; Artificial Intelligence; Information Retrieval
url	https://arxiv.org/abs/2503.04796

Similar Items