Staff View: Evaluation of LLMs in retrieving food and nutritional context for RAG systems :: Library Catalog

Bibliographic Details
Main Authors:	Vavken, Maks Požarnik; Ogrinc, Matevž; Eftimov, Tome; Seljak, Barbara Koroušić
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.09704

_version_	1866917332773765120
author	Vavken, Maks Požarnik; Ogrinc, Matevž; Eftimov, Tome; Seljak, Barbara Koroušić
author_facet	Vavken, Maks Požarnik; Ogrinc, Matevž; Eftimov, Tome; Seljak, Barbara Koroušić
contents	In this article, we evaluate four Large Language Models (LLMs) and their effectiveness at retrieving data within a specialized Retrieval-Augmented Generation (RAG) system, using a comprehensive food composition database. Our method is focused on the LLMs' ability to translate natural language queries into structured metadata filters, enabling efficient retrieval via a Chroma vector database. By achieving high accuracy in this critical retrieval step, we demonstrate that LLMs can serve as an accessible, high-performance tool, drastically reducing the manual effort and technical expertise previously required for domain experts, such as food compilers and nutritionists, to leverage complex food and nutrition data. However, despite the high performance on easy and moderately complex queries, our analysis of difficult questions reveals that reliable retrieval remains challenging when queries involve non-expressible constraints. These findings demonstrate that LLM-driven metadata filtering excels when constraints can be explicitly expressed, but struggles when queries exceed the representational scope of the metadata format.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_09704
institution	arXiv
publishDate	2026
record_format	arxiv
title	Evaluation of LLMs in retrieving food and nutritional context for RAG systems
topic	Computation and Language
url	https://arxiv.org/abs/2603.09704