Lukas Wegmeth
Ph.D. Student
Phone: +49 271 740-2591
Email: lukas.<last-name>@uni-siegen.de
Office: H-C 8318
Address: Office and postal address
Lukas Wegmeth is a Ph.D. student in the Intelligent Systems Group (ISG) at the University of Siegen. Before joining the ISG, he completed his bachelor’s and master’s degrees in Medical Computer Science at the University of Siegen. As a graduate student, Lukas focused on machine learning and collaborated with several chairs at the University of Siegen to produce and publish research papers in the field.
For his Ph.D., Lukas will focus his research on Recommender Systems (RecSys) and Automated Machine Learning (AutoML).
Publications
2023
Wegmeth, Lukas
Improving Recommender Systems Through the Automation of Design Decisions Proceedings Article
In: Proceedings of the 17th ACM Conference on Recommender Systems, pp. 1332–1338, 2023.
@inproceedings{Wegmeth2023a,
title = {Improving Recommender Systems Through the Automation of Design Decisions},
author = {Lukas Wegmeth},
url = {https://dl.acm.org/doi/pdf/10.1145/3604915.3608877},
year = {2023},
date = {2023-01-01},
booktitle = {Proceedings of the 17th ACM Conference on Recommender Systems},
pages = {1332--1338},
abstract = {Recommender systems developers are constantly faced with difficult design decisions. Additionally, the number of options that a recommender systems developer has to consider continually grows over time with new innovations. The machine learning community is in a similar situation and has come together to tackle the problem. They invented concepts and tools to make machine learning development both easier and faster. These developments are categorized as automated machine learning (AutoML). As a result, the AutoML community formed and continuously innovates new approaches. Inspired by AutoML, the recommender systems community has recently understood the need for automation and sparsely introduced AutoRecSys. The goal of AutoRecSys is not to replace recommender systems developers but to improve performance through the automation of design decisions. With AutoRecSys, recommender systems engineers do not have to focus on easy but time-consuming tasks and are free to pursue difficult engineering tasks instead. Additionally, AutoRecSys enables easier access to recommender systems for beginners as it reduces the amount of knowledge required to get started with the development of recommender systems. AutoRecSys, like AutoML, is still early in its development and does not yet cover the whole development pipeline. Additionally, it is not yet clear, under which circumstances AutoML approaches can be transferred to recommender systems. Our research intends to close this gap by improving AutoRecSys both with regard to the transfer of AutoML and novel approaches. Furthermore, we focus specifically on the development of novel automation approaches for data processing and training. We note that the realization of AutoRecSys is going to be a community effort. Our part in this effort is to research AutoRecSys fundamentals, build practical tools for the community, raise awareness of the advantages of automation, and catalyze AutoRecSys development.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wegmeth, Lukas; Vente, Tobias; Beel, Joeran
The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems Journal Article
In: COSEAL Workshop 2023, 2023.
@article{Wegmeth2023b,
title = {The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems},
author = {Lukas Wegmeth and Tobias Vente and Joeran Beel},
url = {http://dx.doi.org/10.13140/RG.2.2.24089.19049},
year = {2023},
date = {2023-01-01},
journal = {COSEAL Workshop 2023},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wegmeth, Lukas; Vente, Tobias; Purucker, Lennart; Beel, Joeran
The Effect of Random Seeds for Data Splitting on Recommendation Accuracy Proceedings Article
In: Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop, 2023.
@inproceedings{Wegmeth2023,
title = {The Effect of Random Seeds for Data Splitting on Recommendation Accuracy},
author = {Lukas Wegmeth and Tobias Vente and Lennart Purucker and Joeran Beel},
url = {https://ceur-ws.org/Vol-3476/paper4.pdf},
year = {2023},
date = {2023-01-01},
booktitle = {Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop},
abstract = {The evaluation of recommender system algorithms depends on randomness, e.g., during randomly splitting data into training and testing data. We suspect that failing to account for randomness in this scenario may lead to misrepresenting the predictive accuracy of recommendation algorithms. To understand the community’s view of the importance of randomness, we conducted a paper study on 39 full papers published at the ACM RecSys 2022 conference. We found that the authors of 26 papers used some variation of a holdout split that requires a random seed. However, only five papers explicitly repeated experiments and averaged their results over different random seeds. This potentially problematic research practice motivated us to analyze the effect of data split random seeds on recommendation accuracy. Therefore, we train three common algorithms on nine public data sets with 20 data split random seeds, evaluate them on two ranking metrics with three different ranking cutoff values k, and compare the results. In the extreme case with k = 1, we show that depending on the data split random seed, the accuracy with traditional recommendation algorithms deviates by up to ∼6.3% from the mean accuracy achieved on the data set. Hence, we show that an algorithm may significantly over- or under-perform when maliciously or negligently selecting a random seed for splitting the data. To showcase a mitigation strategy and better research practice, we compare holdout to cross-validation and show that, again, for k = 1, the accuracy of algorithms evaluated with cross-validation deviates only up to ∼2.3% from the mean accuracy achieved on the data set. Furthermore, we found that the deviation becomes smaller the higher the value of k for both holdout and cross-validation.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2022
Wegmeth, Lukas; Beel, Joeran
CaMeLS: Cooperative Meta-Learning Service for Recommender Systems Proceedings Article
In: Proceedings of the 2nd Perspectives on the Evaluation of Recommender Systems Workshop, pp. 10–18, 2022.
@inproceedings{Wegmeth2022,
title = {CaMeLS: Cooperative Meta-Learning Service for Recommender Systems},
author = {Lukas Wegmeth and Joeran Beel},
url = {https://ceur-ws.org/Vol-3228/paper2.pdf},
year = {2022},
date = {2022-01-01},
booktitle = {Proceedings of the 2nd Perspectives on the Evaluation of Recommender Systems Workshop},
pages = {10--18},
abstract = {We present CaMeLS, a proof of concept of a cooperative meta-learning service for recommender systems. CaMeLS leverages the computing power of recommender systems users by uploading their metadata and algorithm evaluation scores to a centralized environment. Through the resulting database, CaMeLS then offers meta-learning services for everyone. Additionally, users may access evaluations of common data sets immediately to know the best-performing algorithms for those data sets. The metadata table may also be used for other purposes, e.g., to perform benchmarks. In the initial version discussed in this paper, CaMeLS implements automatic algorithm selection through meta-learning over two recommender systems libraries. Automatic algorithm selection saves users time and computing power and does not require expertise, as the best algorithm is automatically found over multiple libraries. The CaMeLS database contains 20 metadata sets by default. We show that the automatic algorithm selection service is already on par with the single best algorithm in this default scenario. CaMeLS only requires a few seconds to predict a suitable algorithm, rather than potentially hours or days if performed manually, depending on the data set. The code is publicly available on our GitHub https://camels.recommender-systems.com.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wegmeth, Lukas; Beel, Joeran
Cooperative Meta-Learning Service for Recommender Systems Journal Article
In: COSEAL Workshop 2022, 2022.
@article{Wegmeth2022a,
title = {Cooperative Meta-Learning Service for Recommender Systems},
author = {Lukas Wegmeth and Joeran Beel},
url = {http://dx.doi.org/10.13140/RG.2.2.10667.41768},
year = {2022},
date = {2022-01-01},
journal = {COSEAL Workshop 2022},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wegmeth, Lukas
The Impact of Feature Quantity on Recommendation Algorithm Performance: A Movielens-100K Case Study Proceedings Article
In: arXiv:2207.08713, 2022.
@inproceedings{Wegmeth2022b,
title = {The Impact of Feature Quantity on Recommendation Algorithm Performance: A Movielens-100K Case Study},
author = {Lukas Wegmeth},
url = {https://arxiv.org/pdf/2207.08713.pdf},
year = {2022},
date = {2022-01-01},
booktitle = {arXiv:2207.08713},
abstract = {Recent model-based Recommender Systems (RecSys) algorithms emphasize on the use of features, also called side information, in their design similar to algorithms in Machine Learning (ML). In contrast, some of the most popular and traditional algorithms for RecSys solely focus on a given user-item-rating relation without including side information. An important category of these is matrix factorization-based algorithms, e.g., Singular Value Decomposition and Alternating Least Squares, which are known to have high performance on RecSys data sets. The goal of this case study is to provide a performance comparison and assessment of RecSys and ML algorithms when side information is included. We chose the Movielens-100K data set since it is a standard for comparing RecSys algorithms. We compared six different feature sets with varying quantities of features which were generated from the baseline data and evaluated on a total of 19 RecSys algorithms, baseline ML algorithms, Automated Machine Learning (AutoML) pipelines, and state-of-the-art RecSys algorithms that incorporate side information. The results show that additional features benefit all algorithms we evaluated. However, the correlation between feature quantity and performance is not monotonous for AutoML and RecSys. In these categories, an analysis of feature importance revealed that the quality of features matters more than quantity. Throughout our experiments, the average performance on the feature set with the lowest number of features is ∼6% worse compared to that with the highest in terms of the Root Mean Squared Error. An interesting observation is that AutoML outperforms matrix factorization-based RecSys algorithms when additional features are used. Almost all algorithms that can include side information have higher performance when using the highest quantity of features. In the other cases, the performance difference is negligible (<1%). 
The results show a clear positive trend for the effect of feature quantity as well as the important effects of feature quality on the evaluated algorithms.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}