| Main Authors: | Vente, Tobias; Heep, Michael; Abbas, Abdullah; Sperle, Theodor; Beel, Joeran; Goethals, Bart |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Information Retrieval |
| Online Access: | https://arxiv.org/abs/2508.19399 |
| _version_ | 1866914008624267264 |
|---|---|
| author | Vente, Tobias; Heep, Michael; Abbas, Abdullah; Sperle, Theodor; Beel, Joeran; Goethals, Bart |
| author_facet | Vente, Tobias; Heep, Michael; Abbas, Abdullah; Sperle, Theodor; Beel, Joeran; Goethals, Bart |
| contents | Dataset selection is crucial for offline recommender system experiments, as mismatched data (e.g., sparse interaction scenarios require datasets with low user-item density) can lead to unreliable results. Yet, 86% of ACM RecSys 2024 papers provide no justification for their dataset choices, with most relying on just four datasets: Amazon (38%), MovieLens (34%), Yelp (15%), and Gowalla (12%). While Algorithm Performance Spaces (APS) were proposed to guide dataset selection, their adoption has been limited due to the absence of an intuitive, interactive tool for APS exploration. Therefore, we introduce the APS Explorer, a web-based visualization tool for interactive APS exploration, enabling data-driven dataset selection. The APS Explorer provides three interactive features: (1) an interactive PCA plot showing dataset similarity via performance patterns, (2) a dynamic meta-feature table for dataset comparisons, and (3) a specialized visualization for pairwise algorithm performance. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2508_19399 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | APS Explorer: Navigating Algorithm Performance Spaces for Informed Dataset Selection |
| topic | Information Retrieval |
| url | https://arxiv.org/abs/2508.19399 |