Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Makri, Eftychia; Nakis, Nikolaos; Sisson, Laura; Minsky, Gigi; Tassiulas, Leandros; Satarifard, Vahid; Christakis, Nicholas A.
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.00002

_version_	1866912993147617280
author	Makri, Eftychia; Nakis, Nikolaos; Sisson, Laura; Minsky, Gigi; Tassiulas, Leandros; Satarifard, Vahid; Christakis, Nicholas A.
contents	Here we introduce the Olfactory Perception (OP) benchmark, designed to assess the capability of large language models (LLMs) to reason about smell. The benchmark contains 1,010 questions across eight task categories spanning odor classification, odor primary descriptor identification, intensity and pleasantness judgments, multi-descriptor prediction, mixture similarity, olfactory receptor activation, and smell identification from real-world odor sources. Each question is presented in two prompt formats, compound names and isomeric SMILES, to evaluate the effect of molecular representations. Evaluating 21 model configurations across major model families, we find that compound-name prompts consistently outperform isomeric SMILES, with gains ranging from +2.4 to +18.9 percentage points (mean approximately +7 points), suggesting that current LLMs access olfactory knowledge primarily through lexical associations rather than structural molecular reasoning. The best-performing model reaches 64.4% overall accuracy, which highlights both emerging capabilities and substantial remaining gaps in olfactory reasoning. We further evaluate a subset of the OP benchmark across 21 languages and find that aggregating predictions across languages improves olfactory prediction, with AUROC = 0.86 for the best-performing language ensemble model. LLMs should be able to handle olfactory information, not just visual or aural information.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_00002
institution	arXiv
publishDate	2026
record_format	arxiv
title	Benchmark for Assessing Olfactory Perception of Large Language Models
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2604.00002
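
The abstract in the contents field reports two kinds of numbers: accuracy gains in percentage points for compound-name prompts over isomeric SMILES prompts, and an AUROC for predictions aggregated across languages. The record does not say how the authors aggregate or score predictions, so the following is only a minimal sketch under stated assumptions: aggregation is taken to be a plain per-item average of scores, AUROC is computed as the pairwise ranking probability, and all function names (`auroc`, `ensemble_scores`, `gain_in_points`) and the toy values are hypothetical illustrations, not data from the benchmark.

```python
from statistics import mean


def auroc(labels, scores):
    """AUROC as the probability that a random positive item is scored
    above a random negative item (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ensemble_scores(per_language_scores):
    """Assumed aggregation rule: average each item's score across languages."""
    return [mean(item) for item in zip(*per_language_scores.values())]


def gain_in_points(name_accuracy, smiles_accuracy):
    """Difference between two accuracies (fractions in [0, 1]),
    expressed in percentage points."""
    return round(100 * (name_accuracy - smiles_accuracy), 1)


# Toy, invented values for illustration only (not benchmark results).
labels = [1, 0, 1, 0]
toy_scores = {
    "en": [0.9, 0.6, 0.4, 0.2],
    "de": [0.5, 0.1, 0.8, 0.7],
}

if __name__ == "__main__":
    for lang, scores in toy_scores.items():
        print(lang, auroc(labels, scores))
    print("ensemble", auroc(labels, ensemble_scores(toy_scores)))
    print("gain", gain_in_points(0.60, 0.50))
```

In this toy setup each language makes a different ranking mistake, so averaging the scores ranks the items better than either language alone. That is one plausible reading of why aggregation across languages could help, not a claim about the mechanism in the paper.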
