WEGMETH, Lukas (Scientific Staff)

Beel, Joeran; Gipp, Bela; Jannach, Dietmar; Said, Alan; Wegmeth, Lukas; Vente, Tobias

Checky, the Paper-Submission Checklist Generator for Authors, Reviewers and LLMs Proceedings Article

In: 47th European Conference on Information Retrieval (ECIR), pp. 10–15, Springer, 2025.

Beel, Joeran; Said, Alan; Vente, Tobias; Wegmeth, Lukas

Green Recommender Systems: A Call for Attention Journal Article

In: ACM SIGIR Forum, vol. 58, no. 2, pp. 1–5, 2024.

Beel, Joeran; Jannach, Dietmar; Said, Alan; Shani, Guy; Vente, Tobias; Wegmeth, Lukas

Best-Practices for Offline Evaluations of Recommender Systems Proceedings Article

In: Bauer, Christine; Said, Alan; Zangerle, Eva (Ed.): Report from Dagstuhl Seminar 24211 – Evaluation Perspectives of Recommender Systems: Driving Research and Education, 2024.

Baumgart, Moritz; Wegmeth, Lukas; Vente, Tobias; Beel, Joeran

e-Fold Cross-Validation for Recommender-System Evaluation Proceedings Article

In: International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.

Beel, Joeran; Wegmeth, Lukas; Vente, Tobias

E-fold Cross-validation: A Computing and Energy-efficient Alternative to K-fold Cross-validation with Adaptive Folds [Proposal] Journal Article

In: OSF Preprints, 2024.

Wegmeth, Lukas; Vente, Tobias; Said, Alan; Beel, Joeran

EMERS: Energy Meter for Recommender Systems Proceedings Article

In: International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.

Vente, Tobias; Wegmeth, Lukas; Said, Alan; Beel, Joeran

From Clicks to Carbon: The Environmental Toll of Recommender Systems Proceedings Article

In: Proceedings of the 18th ACM Conference on Recommender Systems, pp. 580–590, Association for Computing Machinery, Bari, Italy, 2024, ISBN: 9798400705052.

Vente, Tobias; Mehta, Zainil; Wegmeth, Lukas; Beel, Joeran

Greedy Ensemble Selection for Top-N Recommendations Proceedings Article

In: RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.

Beel, Joeran; Wegmeth, Lukas; Michiels, Lien; Schulz, Steffen

Informed Dataset Selection with ‘Algorithm Performance Spaces’ Proceedings Article

In: 18th ACM Conference on Recommender Systems, pp. 1085–1090, Association for Computing Machinery, Bari, Italy, 2024, ISBN: 9798400705052.

@inproceedings{Beel2024b,

title = {Informed Dataset Selection with ‘Algorithm Performance Spaces’},

author = {Joeran Beel and Lukas Wegmeth and Lien Michiels and Steffen Schulz},

url = {https://doi.org/10.1145/3640457.3691704 

https://isg.beel.org/blog/2024/09/01/informed-dataset-selection-with-algorithm-performance-spaces/},

doi = {10.1145/3640457.3691704},

isbn = {9798400705052},

year  = {2024},

date = {2024-01-01},

booktitle = {18th ACM Conference on Recommender Systems},

pages = {1085–1090},

publisher = {Association for Computing Machinery},

address = {Bari, Italy},

series = {RecSys '24},

abstract = {When designing recommender-systems experiments, a key question that has been largely overlooked is the choice of datasets. In a brief survey of ACM RecSys papers, we found that authors typically justified their dataset choices by labelling them as public, benchmark, or ‘real-world’ without further explanation. We propose the Algorithm Performance Space (APS) as a novel method for informed dataset selection. The APS is an n-dimensional space where each dimension represents the performance of a different algorithm. Each dataset is depicted as an n-dimensional vector, with greater distances indicating higher diversity. In our experiment, we ran 29 algorithms on 95 datasets to construct an actual APS. Our findings show that many datasets, including most Amazon datasets, are clustered closely in the APS, i.e. they are not diverse. However, other datasets, such as MovieLens and Docear, are more dispersed. The APS also enables the grouping of datasets based on the solvability of the underlying problem. Datasets in the top right corner of the APS are considered ’solved problems’ because all algorithms perform well on them. Conversely, datasets in the bottom left corner lack well-performing algorithms, making them ideal candidates for new recommender-system research due to the challenges they present.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Wegmeth, Lukas; Vente, Tobias; Beel, Joeran

Recommender Systems Algorithm Selection for Ranking Prediction on Implicit Feedback Datasets Proceedings Article

In: 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.

Meister, Philipp; Wegmeth, Lukas; Vente, Tobias; Beel, Joeran

Removing Bad Influence: Identifying and Pruning Detrimental Users in Collaborative Filtering Recommender Systems Proceedings Article

In: RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.

Wegmeth, Lukas; Vente, Tobias; Purucker, Lennart

Revealing the Hidden Impact of Top-N Metrics on Optimization in Recommender Systems Proceedings Article

In: Goharian, Nazli; Tonellotto, Nicola; He, Yulan; Lipani, Aldo; McDonald, Graham; Macdonald, Craig; Ounis, Iadh (Ed.): 46th European Conference on Information Retrieval (ECIR), pp. 140–156, Springer Nature Switzerland, Cham, 2024, ISBN: 978-3-031-56027-9.

@inproceedings{Wegmeth2024b,

title = {Revealing the Hidden Impact of Top-N Metrics on Optimization in Recommender Systems},

author = {Lukas Wegmeth and Tobias Vente and Lennart Purucker},

editor = {Nazli Goharian and Nicola Tonellotto and Yulan He and Aldo Lipani and Graham McDonald and Craig Macdonald and Iadh Ounis},

url = {https://link.springer.com/chapter/10.1007/978-3-031-56027-9_9 

https://arxiv.org/pdf/2401.08444},

doi = {10.1007/978-3-031-56027-9_9},

isbn = {978-3-031-56027-9},

year  = {2024},

date = {2024-01-01},

booktitle = {46th European Conference on Information Retrieval (ECIR)},

pages = {140–156},

publisher = {Springer Nature Switzerland},

address = {Cham},

abstract = {The hyperparameters of recommender systems for top-n predictions are typically optimized to enhance the predictive performance of algorithms. Thereby, the optimization algorithm, e.g., grid search or random search, searches for the best hyperparameter configuration according to an optimization-target metric, like nDCG or Precision. In contrast, the optimized algorithm, e.g., Alternating Least Squares Matrix Factorization or Bayesian Personalized Ranking, internally optimizes a different loss function during training, like squared error or cross-entropy. To tackle this discrepancy, recent work focused on generating loss functions better suited for recommender systems. Yet, when evaluating an algorithm using a top-n metric during optimization, another discrepancy between the optimization-target metric and the training loss has so far been ignored. During optimization, the top-n items are selected for computing a top-n metric; ignoring that the top-n items are selected from the recommendations of a model trained with an entirely different loss function. Item recommendations suitable for optimization-target metrics could be outside the top-n recommended items; hiddenly impacting the optimization performance. Therefore, we were motivated to analyze whether the top-n items are optimal for optimization-target top-n metrics. In pursuit of an answer, we exhaustively evaluate the predictive performance of 250 selection strategies besides selecting the top-n. We extensively evaluate each selection strategy over twelve implicit feedback and eight explicit feedback data sets with eleven recommender systems algorithms. Our results show that there exist selection strategies other than top-n that increase predictive performance for various algorithms and recommendation domains. However, the performance of the top ∼43% of selection strategies is not significantly different. 
We discuss the impact of our findings on optimization and re-ranking in recommender systems and feasible solutions. The implementation of our study is publicly available.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}


Wegmeth, Lukas

Improving Recommender Systems Through the Automation of Design Decisions Proceedings Article

In: Proceedings of the 17th ACM Conference on Recommender Systems, pp. 1332–1338, 2023.

@inproceedings{Wegmeth2023a,

title = {Improving Recommender Systems Through the Automation of Design Decisions},

author = {Lukas Wegmeth},

url = {https://dl.acm.org/doi/pdf/10.1145/3604915.3608877},

year  = {2023},

date = {2023-01-01},

booktitle = {Proceedings of the 17th ACM Conference on Recommender Systems},

pages = {1332-1338},

abstract = {Recommender systems developers are constantly faced with difficult design decisions. Additionally, the number of options that a recommender systems developer has to consider continually grows over time with new innovations. The machine learning community is in a similar situation and has come together to tackle the problem. They invented concepts and tools to make machine learning development both easier and faster. These developments are categorized as automated machine learning (AutoML). As a result, the AutoML community formed and continuously innovates new approaches. Inspired by AutoML, the recommender systems community has recently understood the need for automation and sparsely introduced AutoRecSys. The goal of AutoRecSys is not to replace recommender systems developers but to improve performance through the automation of design decisions. With AutoRecSys, recommender systems engineers do not have to focus on easy but time-consuming tasks and are free to pursue difficult engineering tasks instead. Additionally, AutoRecSys enables easier access to recommender systems for beginners as it reduces the amount of knowledge required to get started with the development of recommender systems. AutoRecSys, like AutoML, is still early in its development and does not yet cover the whole development pipeline. Additionally, it is not yet clear, under which circumstances AutoML approaches can be transferred to recommender systems. Our research intends to close this gap by improving AutoRecSys both with regard to the transfer of AutoML and novel approaches. Furthermore, we focus specifically on the development of novel automation approaches for data processing and training. We note that the realization of AutoRecSys is going to be a community effort. Our part in this effort is to research AutoRecSys fundamentals, build practical tools for the community, raise awareness of the advantages of automation, and catalyze AutoRecSys development.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Wegmeth, Lukas; Vente, Tobias; Beel, Joeran

The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems Journal Article

In: COSEAL Workshop 2023, 2023.

Wegmeth, Lukas; Vente, Tobias; Purucker, Lennart; Beel, Joeran

The Effect of Random Seeds for Data Splitting on Recommendation Accuracy Proceedings Article

In: Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop, 2023.

@inproceedings{Wegmeth2023,

title = {The Effect of Random Seeds for Data Splitting on Recommendation Accuracy},

author = {Lukas Wegmeth and Tobias Vente and Lennart Purucker and Joeran Beel},

url = {https://ceur-ws.org/Vol-3476/paper4.pdf},

year  = {2023},

date = {2023-01-01},

booktitle = {Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop},

abstract = {The evaluation of recommender system algorithms depends on randomness, e.g., during randomly splitting data into training and testing data. We suspect that failing to account for randomness in this scenario may lead to misrepresenting the predictive accuracy of recommendation algorithms. To understand the community’s view of the importance of randomness, we conducted a paper study on 39 full papers published at the ACM RecSys 2022 conference. We found that the authors of 26 papers used some variation of a holdout split that requires a random seed. However, only five papers explicitly repeated experiments and averaged their results over different random seeds. This potentially problematic research practice motivated us to analyze the effect of data split random seeds on recommendation accuracy. Therefore, we train three common algorithms on nine public data sets with 20 data split random seeds, evaluate them on two ranking metrics with three different ranking cutoff values k, and compare the results. In the extreme case with k = 1, we show that depending on the data split random seed, the accuracy with traditional recommendation algorithms deviates by up to ∼6.3% from the mean accuracy achieved on the data set. Hence, we show that an algorithm may significantly over- or under-perform when maliciously or negligently selecting a random seed for splitting the data. To showcase a mitigation strategy and better research practice, we compare holdout to cross-validation and show that, again, for k = 1, the accuracy of algorithms evaluated with cross-validation deviates only up to ∼2.3% from the mean accuracy achieved on the data set. Furthermore, we found that the deviation becomes smaller the higher the value of k for both holdout and cross-validation.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Wegmeth, Lukas; Beel, Joeran

CaMeLS: Cooperative Meta-Learning Service for Recommender Systems Proceedings Article

In: Proceedings of the 2nd Perspectives on the Evaluation of Recommender Systems Workshop, 2022.

Wegmeth, Lukas; Beel, Joeran

Cooperative Meta-Learning Service for Recommender Systems Journal Article

In: COSEAL Workshop 2022, 2022.

Wegmeth, Lukas

The Impact of Feature Quantity on Recommendation Algorithm Performance: A Movielens-100K Case Study Proceedings Article

In: arXiv:2207.08713, 2022.

@inproceedings{Wegmeth2022b,

title = {The Impact of Feature Quantity on Recommendation Algorithm Performance: A Movielens-100K Case Study},

author = {Lukas Wegmeth},

url = {https://arxiv.org/pdf/2207.08713.pdf},

year  = {2022},

date = {2022-01-01},

booktitle = {arXiv:2207.08713},

abstract = {Recent model-based Recommender Systems (RecSys) algorithms emphasize on the use of features, also called side information, in their design similar to algorithms in Machine Learning (ML). In contrast, some of the most popular and traditional algorithms for RecSys solely focus on a given user-item-rating relation without including side information. An important category of these is matrix factorization-based algorithms, e.g., Singular Value Decomposition and Alternating Least Squares, which are known to have high performance on RecSys data sets. The goal of this case study is to provide a performance comparison and assessment of RecSys and ML algorithms when side information is included. We chose the Movielens-100K data set since it is a standard for comparing RecSys algorithms. We compared six different feature sets with varying quantities of features which were generated from the baseline data and evaluated on a total of 19 RecSys algorithms, baseline ML algorithms, Automated Machine Learning (AutoML) pipelines, and state-of-the-art RecSys algorithms that incorporate side information. The results show that additional features benefit all algorithms we evaluated. However, the correlation between feature quantity and performance is not monotonous for AutoML and RecSys. In these categories, an analysis of feature importance revealed that the quality of features matters more than quantity. Throughout our experiments, the average performance on the feature set with the lowest number of features is ∼6% worse compared to that with the highest in terms of the Root Mean Squared Error. An interesting observation is that AutoML outperforms matrix factorization-based RecSys algorithms when additional features are used. Almost all algorithms that can include side information have higher performance when using the highest quantity of features. In the other cases, the performance difference is negligible (<1%). 
The results show a clear positive trend for the effect of feature quantity as well as the important effects of feature quality on the evaluated algorithms.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}


Lukas Wegmeth
Ph.D. Student