We are excited to share that our PhD student, Tobias Vente, presented our research paper, “APS Explorer: Navigating Algorithm Performance Spaces for Informed Dataset Selection”, at the ACM RecSys 2025 conference held at the O2 Universum Convention Center in Prague, Czechia.

The Problem: Why RecSys Needs Better Dataset Selection

An analysis of all full papers accepted at ACM RecSys 2024 showed that 86% offered no explanation for why their chosen datasets are suitable for their specific experiments. The result? An overwhelming focus on just four kinds of datasets: Amazon (38%), MovieLens (34%), Yelp (15%), and Gowalla (12%).

Selecting a dataset that doesn’t match the experimental scenario can lead to misleading results and hinder progress by limiting the diversity and robustness of research findings. For example, investigating the reduction of bias requires datasets in which bias is empirically present. Tobias explained that while the concept of Algorithm Performance Spaces (APS) offered a theoretical solution, the lack of a practical tool prevented its widespread adoption.

APS Explorer: Navigating Algorithm Performance Spaces for Informed Dataset Selection

This is precisely where the APS Explorer comes in. In his demo, Tobias showcased the innovative web-based APS Explorer, designed to transform algorithm performance spaces from an abstract concept into an interactive, visual guide for every practitioner.

His live demonstration focused on three powerful features that captivated the audience:

  1. The Interactive PCA Plot: Tobias showed how the tool visually maps datasets based on performance similarity, allowing users to see which datasets form natural clusters and identify suitable alternatives or diverse datasets at a glance.
  2. The Dynamic Meta-Feature Table: He demonstrated how to directly compare key dataset characteristics like density and sparsity, ensuring the data’s intrinsic properties align with the research goals.
  3. The Pairwise Performance Visualizer: This feature lets users dive deep into how specific algorithms compare head-to-head across the entire landscape of datasets, moving beyond simple rankings to uncover nuanced performance patterns.
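
To make the idea behind the PCA plot concrete, here is a minimal sketch of how datasets could be placed in a two-dimensional space based on how algorithms perform on them. It is an illustration only, not the APS Explorer's actual implementation: the dataset names, the scores, and the helper functions pca_project and nearest_dataset are all hypothetical.

```python
import numpy as np

# Hypothetical scores (e.g. nDCG@10): rows are datasets, columns are algorithms.
# These numbers are invented for illustration and are not taken from the paper.
datasets = ["dataset_a", "dataset_b", "dataset_c", "dataset_d"]
scores = np.array([
    [0.30, 0.28, 0.10],
    [0.31, 0.27, 0.11],
    [0.05, 0.12, 0.20],
    [0.06, 0.13, 0.22],
])


def pca_project(matrix, n_components=2):
    """Project the rows onto the top principal components using SVD."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def nearest_dataset(coords, index):
    """Return the index of the dataset closest to `index` in the projected space."""
    distances = np.linalg.norm(coords - coords[index], axis=1)
    distances[index] = np.inf  # ignore the dataset itself
    return int(np.argmin(distances))


coords = pca_project(scores)
closest = nearest_dataset(coords, 0)
print(datasets[0], "is closest to", datasets[closest])
```

Datasets that end up close together in this space are ones on which the algorithms behave similarly, so a nearby dataset is a candidate alternative, while a distant one adds diversity to an evaluation.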

Why This Matters: The Impact of Data-Driven Choices

Tobias’s presentation made a compelling case for a more rigorous approach to experimentation. By using the APS Explorer, the RecSys community can:

  • Strengthen Research Validity: Justify dataset selections with concrete, data-driven evidence.
  • Improve Reproducibility: Facilitate better comparisons between studies by using datasets with similar performance characteristics.
  • Accelerate Innovation: Move beyond the “usual suspects” to discover new, well-suited datasets for cutting-edge algorithms.

The positive reception and engaging discussion underscored the tool’s potential to become a cornerstone of robust RecSys research.

Presenting the APS Explorer at RecSys 2025 in the beautiful city of Prague was a significant step forward in our mission to equip the community with the tools needed for smarter, more reliable, and more impactful RecSys research.

Want to try it yourself?
Explore the live tool and learn more about the research behind it here: https://datasets.recommender-systems.com/

