Staff View: Algorithm Performance Spaces for Strategic Dataset Selection :: Library Catalog

Bibliographic Details
Main Author:	Schulz, Steffen
Format:	Preprint
Published:	2025
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2505.01442

_version_	1866915271100334080
author	Schulz, Steffen
author_facet	Schulz, Steffen
contents	The evaluation of new algorithms in recommender systems frequently depends on publicly available datasets, such as those from MovieLens or Amazon. Some of these datasets are used disproportionately often, primarily because of their historical popularity as baselines rather than their suitability for specific research contexts. This thesis addresses this issue by introducing the Algorithm Performance Space, a novel framework designed to differentiate datasets based on the measured performance of algorithms applied to them. An experimental study proposes three metrics to quantify and justify the selection of datasets for evaluating new algorithms. These metrics also validate assumptions about datasets, such as the similarity between MovieLens datasets of varying sizes. Creating an Algorithm Performance Space and applying the proposed metrics made it possible to differentiate datasets and to find diverse dataset selections. While the results demonstrate the framework's potential, further research proposals and implications are discussed to develop Algorithm Performance Spaces tailored to diverse use cases.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_01442
institution	arXiv
publishDate	2025
record_format	arxiv
title	Algorithm Performance Spaces for Strategic Dataset Selection
topic	Information Retrieval
url	https://arxiv.org/abs/2505.01442