Staff View: BSBench: will your LLM find the largest prime number? :: Library Catalog

Bibliographic Details
Main Author:	Erziev, K. O. T.
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2506.04535

_version_	1866916780633489408
author	Erziev, K. O. T.
author_facet	Erziev, K. O. T.
contents	We propose that benchmarking LLMs on questions which have no reasonable answer actually isn't as silly as it sounds. We also present a benchmark that allows such testing and a method to modify existing datasets, and find that existing models perform far from perfectly on such questions. Our code and data artifacts are available at https://github.com/L3G5/impossible-bench
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_04535
institution	arXiv
publishDate	2025
record_format	arxiv
title	BSBench: will your LLM find the largest prime number?
topic	Computation and Language
url	https://arxiv.org/abs/2506.04535

Similar Items