Prof. Dr.-Ing. Joeran Beel
Professor of Intelligent Systems
Students, please read here before contacting me
Phone: +49 271 740-3593 (not for student inquiries)
Email: joeran.<last-name>@uni-siegen.de
Office: H-C 8315
Address: Office and postal address
Bio
Joeran Beel is head of the Intelligent Systems Group at the University of Siegen. His research focuses on automated machine learning & meta-learning, information retrieval, and recommender systems. He has published more than 140 peer-reviewed publications, is a member of the ACM Recommender Systems Steering Committee, serves as an associate editor and information director at ACM TORS, and has reviewed for venues such as SIGIR, ECIR, RecSys, UMAP, ACM TiiS, and JASIST. He has founded multiple award-winning business start-ups, has initiated and contributed to various projects, including Recommender-Systems.com, Docear, TensorFlow, and JabRef, and has acquired over 2.5 million euros in funding for his research and start-ups.
Short CV
Since 2020
Professor of Intelligent Systems at the University of Siegen, Germany
2020-2022
Visiting Research Fellow at Trinity College Dublin, Ireland
2018-2022
Visiting Professor at the National Institute of Informatics (NII) Tokyo, Japan
2016-2020
Assistant Professor in Intelligent Systems at Trinity College Dublin, Ireland, and the ADAPT Centre
2016 (8 Months)
Postdoctoral Researcher at the National Institute of Informatics (NII) Tokyo, Japan, in the group of Prof. Akiko Aizawa.
2015-2016
Product Manager for APIs & Content at HRS Holidays, Munich
2009-2015
PhD (Dr.-Ing) in Computer Science at the University of Magdeburg, Germany, in the group of Prof. Dr. Andreas Nürnberger.
2011-2013
Business Start-Up Founder at Docear in Magdeburg, Germany
2009-2013 (Multiple Stays)
Visiting Researcher at the University of California, Berkeley, USA, in the groups of Prof. Jim Pitman and Prof. Erik Wilde.
2007
Dipl.-Wirt.-Informatik (equivalent to an MSc in Business Information Systems) at the University of Magdeburg, Germany
2006
MSc in Project Management at the Lancaster University Management School, UK
Publications
2024
Beel, Joeran
A Call for Evidence-based Best-Practices for Recommender Systems Evaluations Proceedings Article
In: Bauer, Christine; Said, Alan; Zangerle, Eva (Ed.): Report from Dagstuhl Seminar 24211: Evaluation Perspectives of Recommender Systems: Driving Research and Education, 2024.
@inproceedings{Beel2024d,
title = {A Call for Evidence-based Best-Practices for Recommender Systems Evaluations},
author = {Joeran Beel},
editor = {Christine Bauer and Alan Said and Eva Zangerle},
url = {https://isg.beel.org/pubs/2024_Call_for_Evidence_Based_RecSys_Evaluation__Pre_Print_.pdf},
doi = {10.31219/osf.io/djuac},
year = {2024},
date = {2024-01-01},
booktitle = {Report from Dagstuhl Seminar 24211: Evaluation Perspectives of Recommender Systems: Driving Research and Education},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Beel, Joeran; Jannach, Dietmar; Said, Alan; Shani, Guy; Vente, Tobias; Wegmeth, Lukas
Best-Practices for Offline Evaluations of Recommender Systems Proceedings Article
In: Bauer, Christine; Said, Alan; Zangerle, Eva (Ed.): Report from Dagstuhl Seminar 24211 – Evaluation Perspectives of Recommender Systems: Driving Research and Education, 2024.
@inproceedings{Beel2024,
title = {Best-Practices for Offline Evaluations of Recommender Systems},
author = {Joeran Beel and Dietmar Jannach and Alan Said and Guy Shani and Tobias Vente and Lukas Wegmeth},
editor = {Christine Bauer and Alan Said and Eva Zangerle},
year = {2024},
date = {2024-01-01},
booktitle = {Report from Dagstuhl Seminar 24211 – Evaluation Perspectives of Recommender Systems: Driving Research and Education},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Baumgart, Moritz; Wegmeth, Lukas; Vente, Tobias; Beel, Joeran
e-Fold Cross-Validation for Recommender-System Evaluation Proceedings Article
In: International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Baumgart2024,
title = {e-Fold Cross-Validation for Recommender-System Evaluation},
author = {Moritz Baumgart and Lukas Wegmeth and Tobias Vente and Joeran Beel},
year = {2024},
date = {2024-01-01},
booktitle = {International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Beel, Joeran; Wegmeth, Lukas; Vente, Tobias
E-fold Cross-validation: A Computing and Energy-efficient Alternative to K-fold Cross-validation with Adaptive Folds [Proposal] Journal Article
In: OSF Preprints, 2024.
@article{Beel2024c,
title = {E-fold Cross-validation: A Computing and Energy-efficient Alternative to K-fold Cross-validation with Adaptive Folds [Proposal]},
author = {Joeran Beel and Lukas Wegmeth and Tobias Vente},
url = {https://osf.io/preprints/osf/exw3j
https://isg.beel.org/blog/2024/04/07/e-fold-cross-validation-a-computing-and-energy-efficient-alternative-to-k-fold-cross-validation-with-adaptive-folds-proposal/},
doi = {10.31219/osf.io/exw3j},
year = {2024},
date = {2024-01-01},
journal = {OSF Preprints},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
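The core idea behind e-fold cross-validation, as the title suggests, is to stop evaluating folds early once an additional fold barely changes the estimate. A minimal Python sketch of that adaptive-stopping idea; the concrete stopping rule below (relative change of the running mean) is an assumed stand-in, not necessarily the paper's exact criterion.

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def e_fold_cv(model, X, y, metric, k_max=10, tol=0.01):
    # Evaluate folds one at a time and track the running mean of the metric.
    scores, means = [], []
    splitter = KFold(n_splits=k_max, shuffle=True, random_state=0)
    for i, (train, test) in enumerate(splitter.split(X)):
        fitted = clone(model).fit(X[train], y[train])
        scores.append(metric(y[test], fitted.predict(X[test])))
        means.append(np.mean(scores))
        # Stop early once one more fold barely moves the estimate.
        if i >= 2 and abs(means[-1] - means[-2]) <= tol * abs(means[-2]):
            break
    return means[-1], len(scores)  # estimate and number of folds actually used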
Wegmeth, Lukas; Vente, Tobias; Said, Alan; Beel, Joeran
EMERS: Energy Meter for Recommender Systems Proceedings Article
In: International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Wegmeth2024a,
title = {EMERS: Energy Meter for Recommender Systems},
author = {Lukas Wegmeth and Tobias Vente and Alan Said and Joeran Beel},
url = {https://arxiv.org/pdf/2409.15060},
year = {2024},
date = {2024-01-01},
booktitle = {International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vente, Tobias; Wegmeth, Lukas; Said, Alan; Beel, Joeran
From Clicks to Carbon: The Environmental Toll of Recommender Systems Proceedings Article
In: Proceedings of the 18th ACM Conference on Recommender Systems, pp. 580–590, Association for Computing Machinery, Bari, Italy, 2024, ISBN: 9798400705052.
@inproceedings{Vente2024a,
title = {From Clicks to Carbon: The Environmental Toll of Recommender Systems},
author = {Tobias Vente and Lukas Wegmeth and Alan Said and Joeran Beel},
url = {https://arxiv.org/abs/2408.08203},
doi = {10.1145/3640457.3688074},
isbn = {9798400705052},
year = {2024},
date = {2024-01-01},
booktitle = {Proceedings of the 18th ACM Conference on Recommender Systems},
pages = {580–590},
publisher = {Association for Computing Machinery},
address = {Bari, Italy},
series = {RecSys '24},
abstract = {As global warming soars, the need to assess the environmental impact of research is becoming increasingly urgent. Despite this, few recommender systems research papers address their environmental impact. In this study, we estimate the environmental impact of recommender systems research by reproducing typical experimental pipelines. Our analysis spans 79 full papers from the 2013 and 2023 ACM RecSys conferences, comparing traditional “good old-fashioned AI’’ algorithms with modern deep learning algorithms. We designed and reproduced representative experimental pipelines for both years, measuring energy consumption with a hardware energy meter and converting it to CO2 equivalents. Our results show that papers using deep learning algorithms emit approximately 42 times more CO2 equivalents than papers using traditional methods. On average, a single deep learning-based paper generates 3,297 kilograms of CO2 equivalents—more than the carbon emissions of one person flying from New York City to Melbourne or the amount of CO2 one tree sequesters over 300 years.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
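The conversion at the heart of such measurements is plain arithmetic: multiply the consumed energy by a grid carbon-intensity factor. A minimal sketch with an assumed, purely illustrative factor; the paper uses its own measurement setup and conversion figures.

def energy_to_co2e(energy_kwh, grams_co2e_per_kwh=400.0):
    # Convert energy consumption (kWh) to kilograms of CO2 equivalents.
    # 400 g CO2e/kWh is an assumed illustrative grid average.
    return energy_kwh * grams_co2e_per_kwh / 1000.0

# A hypothetical pipeline drawing 8,000 kWh lands near the paper's
# per-paper magnitude of roughly 3,300 kg CO2e.
print(energy_to_co2e(8000.0))  # 3200.0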
Mahlich, Christopher; Vente, Tobias; Beel, Joeran
From Theory to Practice: Implementing and Evaluating e-Fold Cross-Validation Proceedings Article
In: International Conference on Artificial Intelligence and Machine Learning Research (CAIMLR), 2024.
@inproceedings{Mahlich2024,
title = {From Theory to Practice: Implementing and Evaluating e-Fold Cross-Validation},
author = {Christopher Mahlich and Tobias Vente and Joeran Beel},
url = {https://isg.beel.org/blog/2024/09/16/e-fold-cross-validation/},
year = {2024},
date = {2024-01-01},
booktitle = {International Conference on Artificial Intelligence and Machine Learning Research (CAIMLR)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vente, Tobias; Mehta, Zainil; Wegmeth, Lukas; Beel, Joeran
Greedy Ensemble Selection for Top-N Recommendations Proceedings Article
In: RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Vente2024b,
title = {Greedy Ensemble Selection for Top-N Recommendations},
author = {Tobias Vente and Zainil Mehta and Lukas Wegmeth and Joeran Beel},
year = {2024},
date = {2024-01-01},
booktitle = {RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
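Greedy ensemble selection builds an ensemble by repeatedly adding the model whose inclusion most improves a validation metric, in the spirit of Caruana-style forward selection. A minimal sketch with illustrative names and a higher-is-better metric convention; not the paper's exact procedure.

import numpy as np

def greedy_ensemble(preds, y_val, metric, n_rounds=10):
    # preds: dict of model name -> validation predictions (same shape as y_val).
    # Selection is done with replacement, so strong models can be added twice
    # and thereby receive more weight in the averaged ensemble.
    selected, best_score = [], None
    for _ in range(n_rounds):
        candidates = [
            (name, metric(y_val, np.mean([preds[n] for n in selected + [name]], axis=0)))
            for name in preds
        ]
        name, score = max(candidates, key=lambda c: c[1])
        if best_score is not None and score <= best_score:
            break  # no candidate improves the ensemble any further
        selected.append(name)
        best_score = score
    return selected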
Beel, Joeran; Said, Alan; Vente, Tobias; Wegmeth, Lukas
Green Recommender Systems – A Call for Attention Journal Article
In: Recommender-Systems.com Blog, 2024.
@article{Beel2024e,
title = {Green Recommender Systems – A Call for Attention},
author = {Joeran Beel and Alan Said and Tobias Vente and Lukas Wegmeth},
url = {https://isg.beel.org/pubs/2024_Green_Recommender_Systems-A_Call_for_Attention.pdf},
doi = {10.31219/osf.io/5ru2g},
year = {2024},
date = {2024-01-01},
journal = {Recommender-Systems.com Blog},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Arabzadeh, Ardalan; Vente, Tobias; Beel, Joeran
Green Recommender Systems: Optimizing Dataset Size for Energy-Efficient Algorithm Performance Proceedings Article
In: International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Arabzadeh2024,
title = {Green Recommender Systems: Optimizing Dataset Size for Energy-Efficient Algorithm Performance},
author = {Ardalan Arabzadeh and Tobias Vente and Joeran Beel},
year = {2024},
date = {2024-01-01},
booktitle = {International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood) at the 18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Beel, Joeran; Wegmeth, Lukas; Michiels, Lien; Schulz, Steffen
Informed Dataset Selection with ‘Algorithm Performance Spaces’ Proceedings Article
In: 18th ACM Conference on Recommender Systems, pp. 1085–1090, Association for Computing Machinery, Bari, Italy, 2024, ISBN: 9798400705052.
@inproceedings{Beel2024b,
title = {Informed Dataset Selection with ‘Algorithm Performance Spaces’},
author = {Joeran Beel and Lukas Wegmeth and Lien Michiels and Steffen Schulz},
url = {https://doi.org/10.1145/3640457.3691704
https://isg.beel.org/blog/2024/09/01/informed-dataset-selection-with-algorithm-performance-spaces/},
doi = {10.1145/3640457.3691704},
isbn = {9798400705052},
year = {2024},
date = {2024-01-01},
booktitle = {18th ACM Conference on Recommender Systems},
pages = {1085–1090},
publisher = {Association for Computing Machinery},
address = {Bari, Italy},
series = {RecSys '24},
abstract = {When designing recommender-systems experiments, a key question that has been largely overlooked is the choice of datasets. In a brief survey of ACM RecSys papers, we found that authors typically justified their dataset choices by labelling them as public, benchmark, or ‘real-world’ without further explanation. We propose the Algorithm Performance Space (APS) as a novel method for informed dataset selection. The APS is an n-dimensional space where each dimension represents the performance of a different algorithm. Each dataset is depicted as an n-dimensional vector, with greater distances indicating higher diversity. In our experiment, we ran 29 algorithms on 95 datasets to construct an actual APS. Our findings show that many datasets, including most Amazon datasets, are clustered closely in the APS, i.e. they are not diverse. However, other datasets, such as MovieLens and Docear, are more dispersed. The APS also enables the grouping of datasets based on the solvability of the underlying problem. Datasets in the top right corner of the APS are considered ’solved problems’ because all algorithms perform well on them. Conversely, datasets in the bottom left corner lack well-performing algorithms, making them ideal candidates for new recommender-system research due to the challenges they present.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
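In an Algorithm Performance Space, each dataset becomes a vector of per-algorithm scores, and distances between vectors indicate dataset diversity. A minimal sketch with placeholder scores; the numbers are illustrative, not results from the paper.

import numpy as np
from scipy.spatial.distance import pdist, squareform

datasets = ["movielens", "docear", "amazon-books"]  # illustrative labels
aps = np.array([            # one row per dataset, one column per algorithm
    [0.42, 0.55, 0.61],     # hypothetical nDCG scores of three algorithms
    [0.18, 0.25, 0.22],
    [0.40, 0.53, 0.60],
])
dist = squareform(pdist(aps))  # pairwise distances in the APS
print(dist.round(3))           # nearby rows = similar, i.e. non-diverse, datasets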
Beel, Joeran
Our use of AI-tools for writing research papers Proceedings Article
In: Intelligent Systems Group, Blog, 2024.
@inproceedings{Beel2024a,
title = {Our use of AI-tools for writing research papers},
author = {Joeran Beel},
url = {https://isg.beel.org/blog/2024/08/19/our-use-of-ai-tools-for-writing-research-papers/},
year = {2024},
date = {2024-01-01},
booktitle = {Intelligent Systems Group, Blog},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wegmeth, Lukas; Vente, Tobias; Beel, Joeran
Recommender Systems Algorithm Selection for Ranking Prediction on Implicit Feedback Datasets Proceedings Article
In: 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Wegmeth2024,
title = {Recommender Systems Algorithm Selection for Ranking Prediction on Implicit Feedback Datasets},
author = {Lukas Wegmeth and Tobias Vente and Joeran Beel},
url = {https://arxiv.org/pdf/2409.05461},
year = {2024},
date = {2024-01-01},
booktitle = {18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Meister, Philipp; Wegmeth, Lukas; Vente, Tobias; Beel, Joeran
Removing Bad Influence: Identifying and Pruning Detrimental Users in Collaborative Filtering Recommender Systems Proceedings Article
In: RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys), 2024.
@inproceedings{Meister2024,
title = {Removing Bad Influence: Identifying and Pruning Detrimental Users in Collaborative Filtering Recommender Systems},
author = {Philipp Meister and Lukas Wegmeth and Tobias Vente and Joeran Beel},
year = {2024},
date = {2024-01-01},
booktitle = {RobustRecSys Workshop at the 18th ACM Conference on Recommender Systems (ACM RecSys)},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vente, Tobias; Beel, Joeran
The Potential of AutoML for Recommender Systems Journal Article
In: arXiv, pp. 18, 2024.
@article{Vente2024,
title = {The Potential of AutoML for Recommender Systems},
author = {Tobias Vente and Joeran Beel},
url = {https://arxiv.org/abs/2402.04453},
doi = {10.48550/arXiv.2402.04453},
year = {2024},
date = {2024-01-01},
journal = {arXiv},
pages = {18},
abstract = {Automated Machine Learning (AutoML) has greatly advanced applications of Machine Learning (ML) including model compression, machine translation, and computer vision. Recommender Systems (RecSys) can be seen as an application of ML. Yet, AutoML has found little attention in the RecSys community; nor has RecSys found notable attention in the AutoML community. Only few and relatively simple Automated Recommender Systems (AutoRecSys) libraries exist that adopt AutoML techniques. However, these libraries are based on student projects and do not offer the features and thorough development of AutoML libraries. We set out to determine how AutoML libraries perform in the scenario of an inexperienced user who wants to implement a recommender system. We compared the predictive performance of 60 AutoML, AutoRecSys, ML, and RecSys algorithms from 15 libraries, including a mean predictor baseline, on 14 explicit feedback RecSys datasets. To simulate the perspective of an inexperienced user, the algorithms were evaluated with default hyperparameters. We found that AutoML and AutoRecSys libraries performed best. AutoML libraries performed best for six of the 14 datasets (43%), but it was not always the same AutoML library performing best. The single-best library was the AutoRecSys library Auto-Surprise, which performed best on five datasets (36%). On three datasets (21%), AutoML libraries performed poorly, and RecSys libraries with default parameters performed best. Although, while obtaining 50% of all placements in the top five per dataset, RecSys algorithms fall behind AutoML on average. ML algorithms generally performed the worst.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2023
Purucker, Lennart; Beel, Joeran
A first Look at Meta-Learning Algorithm Selection for Post Hoc Ensembling in AutoML Proceedings Article
In: Poster Track of the COSEAL Workshop, 2023.
@inproceedings{Purucker2023a,
title = {A first Look at Meta-Learning Algorithm Selection for Post Hoc Ensembling in AutoML},
author = {Lennart Purucker and Joeran Beel},
year = {2023},
date = {2023-01-01},
booktitle = {Poster Track of the COSEAL Workshop},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Purucker, Lennart; Beel, Joeran
CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and Salvageable Failure Proceedings Article
In: 2nd International Conference on Automated Machine Learning (AutoML), pp. 1–23, 2023.
@inproceedings{Purucker2023,
title = {CMA-ES for Post Hoc Ensembling in AutoML: A Great Success and Salvageable Failure},
author = {Lennart Purucker and Joeran Beel},
url = {https://openreview.net/pdf?id=MeCwOxob8jfl},
year = {2023},
date = {2023-01-01},
booktitle = {2nd International Conference on Automated Machine Learning (AutoML)},
pages = {1–23},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vente, Tobias; Ekstrand, Michael; Beel, Joeran
Introducing LensKit-Auto, an Experimental Automated Recommender System (AutoRecSys) Toolkit Proceedings Article
In: Proceedings of the 17th ACM Conference on Recommender Systems, pp. 1212-1216, 2023.
@inproceedings{Vente2023a,
title = {Introducing LensKit-Auto, an Experimental Automated Recommender System (AutoRecSys) Toolkit},
author = {Tobias Vente and Michael Ekstrand and Joeran Beel},
url = {https://dl.acm.org/doi/10.1145/3604915.3610656},
year = {2023},
date = {2023-01-01},
booktitle = {Proceedings of the 17th ACM Conference on Recommender Systems},
pages = {1212-1216},
abstract = {LensKit is one of the first and most popular Recommender System libraries. While LensKit offers a wide variety of features, it does not include any optimization strategies or guidelines on how to select and tune LensKit algorithms. LensKit developers have to manually include third-party libraries into their experimental setup or implement optimization strategies by hand to optimize hyperparameters. We found that 63.6% (21 out of 33) of papers using LensKit algorithms for their experiments did not select algorithms or tune hyperparameters. Non-optimized models represent poor baselines and produce less meaningful research results. This demo introduces LensKit-Auto. LensKit-Auto automates the entire Recommender System pipeline and enables LensKit developers to automatically select, optimize, and ensemble LensKit algorithms.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Purucker, Lennart; Schneider, Lennart; Anastacio, Marie; Beel, Joeran; Bischl, Bernd; Hoos, Holger
Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML Proceedings Article
In: 2nd International Conference on Automated Machine Learning (AutoML), pp. 1–34, 2023.
@inproceedings{Purucker2023b,
title = {Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML},
author = {Lennart Purucker and Lennart Schneider and Marie Anastacio and Joeran Beel and Bernd Bischl and Holger Hoos},
url = {https://openreview.net/pdf?id=zvV7hemQmtLl},
year = {2023},
date = {2023-01-01},
booktitle = {2nd International Conference on Automated Machine Learning (AutoML)},
pages = {1–34},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Beel, Joeran; Breuer, Timo; Crescenzi, Anita; Fuhr, Norbert; Li, Meije
Results-blind Reviewing Proceedings Article
In: Bauer, Christine; Carterette, Ben; Ferro, Nicola; Fuhr, Norbert; Faggioli, Guglielmo (Ed.): Frontiers of Information Access Experimentation for Research and Education (Dagstuhl Seminar 23031), pp. 68-154, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023.
@inproceedings{Beel2023,
title = {Results-blind Reviewing},
author = {Joeran Beel and Timo Breuer and Anita Crescenzi and Norbert Fuhr and Meije Li},
editor = {Christine Bauer and Ben Carterette and Nicola Ferro and Norbert Fuhr and Guglielmo Faggioli},
url = {https://isg.beel.org/pubs/2023-Results-Blind-Reviewing-Beel-et-al.pdf},
doi = {10.4230/DagRep.13.1.68},
year = {2023},
date = {2023-01-01},
booktitle = {Frontiers of Information Access Experimentation for Research and Education (Dagstuhl Seminar 23031)},
volume = {13},
number = {1},
pages = {68-154},
publisher = {Schloss Dagstuhl - Leibniz-Zentrum für Informatik},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wegmeth, Lukas; Vente, Tobias; Beel, Joeran
The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems Journal Article
In: COSEAL Workshop 2023, 2023.
@article{Wegmeth2023b,
title = {The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems},
author = {Lukas Wegmeth and Tobias Vente and Joeran Beel},
url = {http://dx.doi.org/10.13140/RG.2.2.24089.19049},
year = {2023},
date = {2023-01-01},
journal = {COSEAL Workshop 2023},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wegmeth, Lukas; Vente, Tobias; Purucker, Lennart; Beel, Joeran
The Effect of Random Seeds for Data Splitting on Recommendation Accuracy Proceedings Article
In: Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop, 2023.
@inproceedings{Wegmeth2023,
title = {The Effect of Random Seeds for Data Splitting on Recommendation Accuracy},
author = {Lukas Wegmeth and Tobias Vente and Lennart Purucker and Joeran Beel},
url = {https://ceur-ws.org/Vol-3476/paper4.pdf},
year = {2023},
date = {2023-01-01},
booktitle = {Proceedings of the 3rd Perspectives on the Evaluation of Recommender Systems Workshop},
abstract = {The evaluation of recommender system algorithms depends on randomness, e.g., during randomly splitting data into training and testing data. We suspect that failing to account for randomness in this scenario may lead to misrepresenting the predictive accuracy of recommendation algorithms. To understand the community’s view of the importance of randomness, we conducted a paper study on 39 full papers published at the ACM RecSys 2022 conference. We found that the authors of 26 papers used some variation of a holdout split that requires a random seed. However, only five papers explicitly repeated experiments and averaged their results over different random seeds. This potentially problematic research practice motivated us to analyze the effect of data split random seeds on recommendation accuracy. Therefore, we train three common algorithms on nine public data sets with 20 data split random seeds, evaluate them on two ranking metrics with three different ranking cutoff values k, and compare the results. In the extreme case with k = 1, we show that depending on the data split random seed, the accuracy with traditional recommendation algorithms deviates by up to ∼6.3% from the mean accuracy achieved on the data set. Hence, we show that an algorithm may significantly over- or under-perform when maliciously or negligently selecting a random seed for splitting the data. To showcase a mitigation strategy and better research practice, we compare holdout to cross-validation and show that, again, for k = 1, the accuracy of algorithms evaluated with cross-validation deviates only up to ∼2.3% from the mean accuracy achieved on the data set. Furthermore, we found that the deviation becomes smaller the higher the value of k for both holdout and cross-validation.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
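The mitigation the paper argues for is straightforward: repeat the split over several seeds and report the spread rather than a single number. A minimal sketch of seed-aware holdout evaluation with scikit-learn; the settings are illustrative.

import numpy as np
from sklearn.model_selection import train_test_split

def evaluate_over_seeds(model_factory, X, y, metric, seeds=range(20)):
    scores = []
    for seed in seeds:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed)  # the seed under study
        model = model_factory().fit(X_tr, y_tr)
        scores.append(metric(y_te, model.predict(X_te)))
    return np.mean(scores), np.std(scores)  # report both, not just the mean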
2022
Purucker, Lennart Oswald; Beel, Joeran
Assembled-OpenML: Creating Efficient Benchmarks for Ensembles in AutoML with OpenML Proceedings Article
In: International Conference on Automated Machine Learning, Late-Breaking Results Track, pp. 1–18, 2022.
@inproceedings{Purucker2022,
title = {Assembled-OpenML: Creating Efficient Benchmarks for Ensembles in AutoML with OpenML},
author = {Lennart Oswald Purucker and Joeran Beel},
url = {https://2022.automl.cc/wp-content/uploads/2022/08/assembled_openml_creating_effi-Main-Paper-And-Supplementary-Material.pdf},
year = {2022},
date = {2022-01-01},
booktitle = {International Conference on Automated Machine Learning, Late-Breaking Results Track},
pages = {1–18},
abstract = {Automated Machine Learning (AutoML) frameworks regularly use ensembles. Developers need to compare different ensemble techniques to select appropriate techniques for an AutoML framework from the many potential techniques. So far, the comparison of ensemble techniques is often computationally expensive, because many base models must be trained and evaluated one or multiple times. Therefore, we present Assembled-OpenML. Assembled-OpenML is a Python tool, which builds meta-datasets for ensembles using OpenML. A meta-dataset, called Metatask, consists of the data of an OpenML task, the task's dataset, and prediction data from model evaluations for the task. We can make the comparison of ensemble techniques computationally cheaper by using the predictions stored in a metatask instead of training and evaluating base models. To introduce Assembled-OpenML, we describe the first version of our tool. Moreover, we present an example of using Assembled-OpenML to compare a set of ensemble techniques. For this example comparison, we built a benchmark using Assembled-OpenML and implemented ensemble techniques expecting predictions instead of base models as input. In our example comparison, we gathered the prediction data of 1523 base models for 31 datasets. Obtaining the prediction data for all base models using Assembled-OpenML took ∼1 hour in total. In comparison, obtaining the prediction data by training and evaluating just one base model on the most computationally expensive dataset took ∼37 minutes.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
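The trick that makes the benchmark cheap is caching base-model predictions once and then evaluating ensemble techniques on the cached predictions alone. A minimal sketch of that idea with illustrative structures; this is not the Assembled-OpenML API.

import numpy as np

metatask = {  # hypothetical cached evaluation data for one task
    "y_test": np.array([0, 1, 1, 0]),
    "base_model_predictions": {
        "rf":  np.array([0, 1, 1, 1]),
        "svm": np.array([0, 1, 0, 0]),
        "knn": np.array([0, 0, 1, 0]),
    },
}

def majority_vote(preds):
    stacked = np.stack(list(preds.values()))
    return (stacked.mean(axis=0) >= 0.5).astype(int)

# Comparing an ensemble technique now costs no model training at all.
ens = majority_vote(metatask["base_model_predictions"])
print((ens == metatask["y_test"]).mean())  # ensemble accuracy: 1.0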
Wegmeth, Lukas; Beel, Joeran
CaMeLS: Cooperative Meta-Learning Service for Recommender Systems Proceedings Article
In: Proceedings of the 2nd Perspectives on the Evaluation of Recommender Systems Workshop, 2022.
@inproceedings{Wegmeth2022,
title = {CaMeLS: Cooperative Meta-Learning Service for Recommender Systems},
author = {Lukas Wegmeth and Joeran Beel},
url = {https://ceur-ws.org/Vol-3228/paper2.pdf},
year = {2022},
date = {2022-01-01},
booktitle = {Proceedings of the 2nd Perspectives on the Evaluation of Recommender Systems Workshop},
abstract = {We present CaMeLS, a proof of concept of a cooperative meta-learning service for recommender systems. CaMeLS leverages the computing power of recommender systems users by uploading their metadata and algorithm evaluation scores to a centralized environment. Through the resulting database, CaMeLS then offers meta-learning services for everyone. Additionally, users may access evaluations of common data sets immediately to know the best-performing algorithms for those data sets. The metadata table may also be used for other purposes, e.g., to perform benchmarks. In the initial version discussed in this paper, CaMeLS implements automatic algorithm selection through meta-learning over two recommender systems libraries. Automatic algorithm selection saves users time and computing power and does not require expertise, as the best algorithm is automatically found over multiple libraries. The CaMeLS database contains 20 metadata sets by default. We show that the automatic algorithm selection service is already on par with the single best algorithm in this default scenario. CaMeLS only requires a few seconds to predict a suitable algorithm, rather than potentially hours or days if performed manually, depending on the data set. The code is publicly available on our GitHub https://camels.recommender-systems.com.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wegmeth, Lukas; Beel, Joeran
Cooperative Meta-Learning Service for Recommender Systems Journal Article
In: COSEAL Workshop 2022, 2022.
@article{Wegmeth2022a,
title = {Cooperative Meta-Learning Service for Recommender Systems},
author = {Lukas Wegmeth and Joeran Beel},
url = {http://dx.doi.org/10.13140/RG.2.2.10667.41768},
year = {2022},
date = {2022-01-01},
journal = {COSEAL Workshop 2022},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Vente, Tobias; Purucker, Lennart; Beel, Joeran
The Feasibility of Greedy Ensemble Selection for Automated Recommender Systems Journal Article
In: COSEAL Workshop 2022, 2022.
@article{Vente2022,
title = {The Feasibility of Greedy Ensemble Selection for Automated Recommender Systems},
author = {Tobias Vente and Lennart Purucker and Joeran Beel},
url = {http://dx.doi.org/10.13140/RG.2.2.16277.29921},
year = {2022},
date = {2022-01-01},
journal = {COSEAL Workshop 2022},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2021
Buskulic, Nathan; Bergman, Edward; Beel, Joeran
Online Neural Architecture Search (ONAS): Adapting neural network architecture search in a continuously evolving domain. [Proposal] Journal Article
In: OSF Preprints, pp. 1-4, 2021.
@article{Buskulic2021,
title = {Online Neural Architecture Search (ONAS): Adapting neural network architecture search in a continuously evolving domain. [Proposal]},
author = {Nathan Buskulic and Edward Bergman and Joeran Beel},
url = {https://osf.io/suqxr},
doi = {10.31219/osf.io/suqxr},
year = {2021},
date = {2021-01-01},
journal = {OSF Preprints},
pages = {1-4},
publisher = {OSF Preprint},
abstract = {Neural Architecture Search research has been limited to fixed datasets and as such does not provide the flexibility needed to deal with real-world, constantly evolving data. This is why we propose the basis of Online Neural Architecture Search (ONAS) to deal with complex, evolving, data distributions. We formalise ONAS as a minimisation problem upon which both the weights and the architecture of the neural network needs to be optimised for the data up until a time $t_i$. To solve this problem, we adapt a DARTS optimisation process, associated with an early stopping scheme, by using the supernet optimised on previous data as a warm-up initial state. This allows the architecture of the neural network to evolve as the data distribution evolves while limiting the computational burden. This work aims at building the initial mathematical formalism of the problem as well as the development of a framework where NAS methods could be used to solve this problem. Finally, several possible next steps are presented to show the potential of this field of Online Neural Architecture Search.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Al-Rawi, Mohammed; Beel, Joeran
Probabilistic Color Modelling of Clothing Items Proceedings Article
In: Dokoohaki, Jaradat N. (Ed.): Recommender Systems in Fashion and Retail, pp. 21–40, Springer, 2021.
@inproceedings{AlRawi2021a,
title = {Probabilistic Color Modelling of Clothing Items},
author = {Mohammed Al-Rawi and Joeran Beel},
editor = {Jaradat N. Dokoohaki},
url = {https://link.springer.com/chapter/10.1007/978-3-030-66103-8_2},
doi = {10.1007/978-3-030-66103-8_2},
year = {2021},
date = {2021-01-01},
booktitle = {Recommender Systems in Fashion and Retail},
volume = {734},
pages = {21–40},
publisher = {Springer},
series = {Lecture Notes in Electrical Engineering book series},
abstract = {Color modelling and extraction is an important topic in fashion. It can help build a wide range of applications, for example, recommender systems, color-based retrieval, fashion design, etc. We aim to develop and test models that can extract the dominant colors of clothing and accessory items. The approach we propose has three stages: (1) Mask-RCNN to segment the clothing items, (2) cluster the colors into a predefined number of groups, and (3) combine the detected colors based on the hue scores and the probability of each score. We use Clothing Co-Parsing and ModaNet datasets for evaluation. We also scrape fashion images from the WWW and use our models to discover the fashion color trend. Subjectively, we were able to extract colors even when clothing items have multiple colors. Moreover, we are able to extract colors along with the probability of them appearing in clothes. The method can provide the color baseline drive for more advanced fashion systems.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
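Stages (2) and (3) of the described pipeline amount to clustering the pixels of a segmented item and reporting each dominant colour with its probability. A minimal sketch using k-means as an illustrative clusterer; the paper's exact pipeline differs.

import numpy as np
from sklearn.cluster import KMeans

def dominant_colors(pixels_rgb, n_colors=3):
    # pixels_rgb: (n_pixels, 3) array of RGB values from a segmentation mask.
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels_rgb)
    counts = np.bincount(km.labels_, minlength=n_colors)
    probs = counts / counts.sum()  # probability of each colour appearing
    centers = km.cluster_centers_.round().astype(int)
    return list(zip(centers.tolist(), probs.tolist()))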
Beel, Joeran; Dixon, Haley
The ‘Unreasonable’ Effectiveness of Graphical User Interfaces for Recommender Systems Proceedings Article
In: Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization, pp. 22–28, Association for Computing Machinery, New York, NY, USA, 2021, ISBN: 9781450383677.
@inproceedings{Beel2021,
title = {The ‘Unreasonable’ Effectiveness of Graphical User Interfaces for Recommender Systems},
author = {Joeran Beel and Haley Dixon},
url = {https://dl.acm.org/doi/fullHtml/10.1145/3450614.3461682
https://www.um.org/umap2021/12-posters/25-the-unreasonable-effectiveness-of-graphical-user-interfaces-for-recommender-systems.html
https://www.researchgate.net/publication/352672485_The_'Unreasonable'_Effectiveness_of_Graphical_User_Interfaces_for_Recommender_Systems},
doi = {10.1145/3450614.3461682},
isbn = {9781450383677},
year = {2021},
date = {2021-01-01},
booktitle = {Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization},
pages = {22–28},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
abstract = {The impact of Graphical User Interfaces (GUI) for recommender systems is a little explored area. Therefore, we conduct an empirical study in which we create, deploy, and evaluate seven different GUI variations. We use these variations to display 68.260 related-blog-post recommendations to 10.595 unique visitors of our blog. The study shows that the GUIs have a strong effect on the recommender systems’ performance, measured in click-through rate (CTR). The best performing GUI achieved a 66% higher CTR than the worst performing GUI (statist. significant with p<0.05). In other words, with a few days of work to develop different GUIs, a recommender-system operator could increase CTR notably – maybe even more than by tuning the recommendation algorithm. In analogy to the ‘unreasonable effectiveness of data’ discussion by Google and others, we conclude that the effectiveness of graphical user interfaces for recommender systems is equally ‘unreasonable’. Hence, the recommender system community should spend more time on researching GUIs for recommender systems. In addition, we conduct a survey and find that the ACM Recommender Systems Conference has a strong focus on algorithms – 81% of all short and full papers published in 2019 and 2020 relate to algorithm development, and none to GUIs for recommender systems. We also surveyed the recommender systems of 50 blogs. While most displayed a thumbnail (86%) and had a mouseover interaction (62%) other design elements were rare. Only few highlighted top recommendations (8%), displayed rankings or relevance scores (6%), or offered a ‘view more’ option (4%).},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Scheidt, Teresa; Beel, Joeran
Time-dependent Evaluation of Recommender Systems Proceedings Article
In: Perspectives on the Evaluation of Recommender Systems Workshop, ACM RecSys Conference, 2021.
@inproceedings{Scheidt2021,
title = {Time-dependent Evaluation of Recommender Systems},
author = {Teresa Scheidt and Joeran Beel},
url = {https://ceur-ws.org/Vol-2955/paper10.pdf},
year = {2021},
date = {2021-01-01},
booktitle = {Perspectives on the Evaluation of Recommender Systems Workshop, ACM RecSys Conference},
abstract = {Evaluation of recommender systems is an actively discussed topic in the recommender system community. However, some aspects of evaluation have received little to no attention, one of them being whether evaluating recommender system algorithms with single-number metrics is sufficient. When presenting results as a single number, the only possible assumption is a stable performance over time regardless of changes in the datasets, while it intuitively seems more likely that the performance changes over time. We suggest presenting results over time, making it possible to identify trends and changes in performance as the dataset grows and changes. In this paper, we conduct an analysis of 6 algorithms on 10 datasets over time to identify the need for a time-dependent evaluation. To enable this evaluation over time, we split the datasets based on the provided timesteps into smaller subsets. At every tested timepoint we use all available data up to this timepoint, simulating a growing dataset as encountered in the realworld. Our results show that for 90% of the datasets the performance changes over time and in 60% even the ranking of algorithms changes over time.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
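The protocol described in the abstract trains on all data up to each tested timepoint and scores on the following window, yielding a performance curve instead of one number. A minimal sketch; the column name and windowing scheme are assumptions.

import pandas as pd

def time_sliced_evaluation(ratings, timepoints, train_and_score):
    # ratings: DataFrame with a 'timestamp' column;
    # train_and_score(train, test) -> metric value.
    curve = []
    for start, end in zip(timepoints[:-1], timepoints[1:]):
        train = ratings[ratings.timestamp < start]  # growing training set
        test = ratings[(ratings.timestamp >= start) & (ratings.timestamp < end)]
        if len(train) and len(test):
            curve.append((start, train_and_score(train, test)))
    return curve  # plot this to see whether the ranking of algorithms shifts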
2020
Grennan, Mark; Beel, Joeran
Synthetic vs. Real Reference Strings for Citation Parsing, and the Importance of Re-training and Out-Of-Sample Data for Meaningful Evaluations: Experiments with GROBID, GIANT and CORA Proceedings Article
In: Proceedings of the 8th International Workshop on Mining Scientific Publications, pp. 27–35, Association for Computational Linguistics, Wuhan, China, 2020.
@inproceedings{Grennan2020,
title = {Synthetic vs. Real Reference Strings for Citation Parsing, and the Importance of Re-training and Out-Of-Sample Data for Meaningful Evaluations: Experiments with GROBID, GIANT and CORA},
author = {Mark Grennan and Joeran Beel},
url = {https://aclanthology.org/2020.wosp-1.4.pdf
https://www.aclweb.org/anthology/2020.wosp-1.4},
year = {2020},
date = {2020-08-01},
booktitle = {Proceedings of the 8th International Workshop on Mining Scientific Publications},
pages = {27–35},
publisher = {Association for Computational Linguistics},
address = {Wuhan, China},
abstract = {Citation parsing, particularly with deep neural networks, suffers
from a lack of training data as available datasets typically contain
only a few thousand training instances. Manually labelling citation
strings is very time-consuming, hence synthetically created training
data could be a solution. However, as of now, it is unknown if synthetically
created reference-strings are suitable to train machine learning
algorithms for citation parsing. To find out, we train Grobid, which
uses Conditional Random Fields, with a) human-labelled reference
strings from `real' bibliographies and b) synthetically created
reference strings from the GIANT dataset. We find that both synthetic
and organic reference strings are equally suited for training Grobid (F1 = 0.74). We additionally find that retraining Grobid has a notable
impact on its performance, for both synthetic and real data (+30%
in F1). Having as many types of labelled fields as possible during
training also improves effectiveness, even if these fields are not
available in the evaluation data (+13.5% F1). We conclude that
synthetic data is suitable for training (deep) citation parsing models.
We further suggest that in future evaluations of reference parsers
both evaluation data similar and dissimilar to the training data
should be used for more meaningful evaluations.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Marwah, Divyanshu; Beel, Joeran
Term-Recency for TF-IDF, BM25 and USE Term Weighting Proceedings Article
In: Proceedings of the 8th International Workshop on Mining Scientific Publications, pp. 36–41, Association for Computational Linguistics, Wuhan, China, 2020.
@inproceedings{Marwah2020,
title = {Term-Recency for TF-IDF, BM25 and USE Term Weighting},
author = {Divyanshu Marwah and Joeran Beel},
url = {https://aclanthology.org/2020.wosp-1.5.pdf
https://www.aclweb.org/anthology/2020.wosp-1.5},
year = {2020},
date = {2020-08-01},
booktitle = {Proceedings of the 8th International Workshop on Mining Scientific Publications},
pages = {36–41},
publisher = {Association for Computational Linguistics},
address = {Wuhan, China},
abstract = {Effectiveness of a recommendation in an Information Retrieval (IR)
system is determined by relevancy scores of retrieved results. Term
weighting is responsible for computing the relevance scores and consequently
differentiating between the terms in a document. However, the current
term weighting formula (TF-IDF, for instance), weighs terms only
based on term frequency and inverse document frequency irrespective
of other important factors. This results in ambiguity in cases when
both TF and IDF values the same for more than one document, hence
resulting in same TF-IDF values. In this paper, we propose a modification
of TF-IDF and other term-weighting schemes that weighs the terms
based on the recency and the usage in the corpus. We have tested
the performance of our algorithm with existing term weighting schemes;
TF-IDF, BM25 and USE text embedding model. We have indexed three
different datasets with different domains to validate the premises
for our algorithm. On evaluating the algorithms using Precision,
Recall, F1 score, and NDCG, we found that time normalized TF-IDF
outperformed the classic TF-IDF with a significant difference in
all the metrics and datasets. Time-based USE model performed better
than the standard USE model in two out of three datasets. But the
time-based BM25 model did not perform well in some of the input queries
as compared to standard BM25 model.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
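One plausible reading of the idea is to scale the classic TF-IDF weight by a decay on term age, so that recently used terms weigh more. A minimal sketch with an assumed exponential decay; this is an illustration, not necessarily the paper's exact normalisation.

import math

def tf_idf_recency(tf, df, n_docs, age_years, half_life=5.0):
    idf = math.log(n_docs / (1 + df))
    recency = 0.5 ** (age_years / half_life)  # newer usage -> factor near 1
    return tf * idf * recency

# Two documents with identical TF and IDF now receive different weights
# if their terms were used at different times.
print(tf_idf_recency(tf=3, df=10, n_docs=1000, age_years=2))
print(tf_idf_recency(tf=3, df=10, n_docs=1000, age_years=20))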
Molloy, Paul; Beel, Joeran; Aizawa, Akiko
Virtual Citation Proximity (VCP): Empowering Document Recommender Systems by Learning a Hypothetical In-Text Citation-Proximity Metric for Uncited Documents Proceedings Article
In: Proceedings of the 8th International Workshop on Mining Scientific Publications, pp. 1–8, Association for Computational Linguistics, Wuhan, China, 2020.
@inproceedings{Molloy2020,
title = {Virtual Citation Proximity (VCP): Empowering Document Recommender Systems by Learning a Hypothetical In-Text Citation-Proximity Metric for Uncited Documents},
author = {Paul Molloy and Joeran Beel and Akiko Aizawa},
url = {https://www.aclweb.org/anthology/2020.wosp-1.1},
year = {2020},
date = {2020-08-01},
booktitle = {Proceedings of the 8th International Workshop on Mining Scientific Publications},
pages = {1–8},
publisher = {Association for Computational Linguistics},
address = {Wuhan, China},
abstract = {The relatedness of research articles, patents, court rulings, web
pages, and other document types is often calculated with citation
or hyperlink-based approaches like co-citation (proximity) analysis.
The main limitation of citation-based approaches is that they cannot
be used for documents that receive little or no citations. We propose
Virtual Citation Proximity (VCP), a Siamese Neural Network architecture,
which combines the advantages of co-citation proximity analysis (diverse
notions of relatedness / high recommendation performance), with the
advantage of content-based filtering (high coverage). VCP is trained
on a corpus of documents with textual features, and with real citation
proximity as ground truth. VCP then predicts for any two documents,
based on their title and abstract, in what proximity the two documents
would be co-cited, if they were indeed co-cited. The prediction can
be used in the same way as real citation proximity to calculate document
relatedness, even for uncited documents. In our evaluation with 2
million co-citations from Wikipedia articles, VCP achieves an MAE
of 0.0055, i.e. an improvement of 20% over the baseline, though
the learning curve suggests that more work is needed.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
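The described architecture pairs a shared text encoder with a small head that regresses the proximity in which two documents would be co-cited. A minimal PyTorch sketch with a placeholder bag-of-embeddings encoder; the paper's actual architecture differs.

import torch
import torch.nn as nn

class SiameseVCP(nn.Module):
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)  # shared between inputs
        self.head = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, doc_a, doc_b):
        ea, eb = self.encoder(doc_a), self.encoder(doc_b)
        return self.head(torch.cat([ea, eb], dim=1))  # predicted proximity

docs_a = torch.randint(0, 30000, (4, 50))  # toy batches of token ids
docs_b = torch.randint(0, 30000, (4, 50))
print(SiameseVCP()(docs_a, docs_b).shape)  # torch.Size([4, 1])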
Tyrrell, Bryan; Bergman, Edward; Jones, Gareth; Beel, Joeran
‘Algorithm-Performance Personas’ for Siamese Meta-Learning and Automated Algorithm Selection Proceedings Article
In: 7th ICML Workshop on Automated Machine Learning, pp. 1–16, 2020.
@inproceedings{Tyrrell2020,
title = {‘Algorithm-Performance Personas’ for Siamese Meta-Learning and Automated Algorithm Selection},
author = {Bryan Tyrrell and Edward Bergman and Gareth Jones and Joeran Beel},
url = {https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_48.pdf},
year = {2020},
date = {2020-01-01},
booktitle = {7th ICML Workshop on Automated Machine Learning},
pages = {1–16},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Gupta, Srijan; Beel, Joeran
Auto-CaseRec: Automatically Selecting and Optimizing Recommendation-Systems Algorithms Journal Article
In: OSF Preprints, 2020.
@article{Gupta2020,
title = {Auto-CaseRec: Automatically Selecting and Optimizing Recommendation-Systems Algorithms},
author = {Srijan Gupta and Joeran Beel},
doi = {10.31219/osf.io/4znmd},
year = {2020},
date = {2020-01-01},
journal = {OSF Preprints},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Anand, Rohan; Beel, Joeran
Auto-Surprise: An Automated Recommender-System (AutoRecSys) Library with Tree of Parzens Estimator (TPE) Optimization Proceedings Article
In: 14th ACM Conference on Recommender Systems (RecSys), pp. 1–4, 2020.
@inproceedings{Anand2020,
title = {Auto-Surprise: An Automated Recommender-System (AutoRecSys) Library with Tree of Parzens Estimator (TPE) Optimization},
author = {Rohan Anand and Joeran Beel},
url = {https://arxiv.org/abs/2008.13532},
year = {2020},
date = {2020-01-01},
booktitle = {14th ACM Conference on Recommender Systems (RecSys)},
pages = {1–4},
abstract = {We introduce Auto-Surprise, an Automated Recommender System library. Auto-Surprise is an extension of the Surprise recommender system library and eases the algorithm selection and configuration process. Compared to out-of-the-box Surprise library, Auto-Surprise performs better when evaluated with MovieLens, Book Crossing and Jester Datasets. It may also result in the selection of an algorithm with significantly lower runtime. Compared to Surprise's grid search, Auto-Surprise performs equally well or slightly better in terms of RMSE, and is notably faster in finding the optimum hyperparameters.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
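Tree of Parzen Estimators (TPE) is the search strategy named in the title. A minimal sketch of a TPE search written with the generic hyperopt library rather than Auto-Surprise itself; the search space and the objective are illustrative placeholders.

from hyperopt import fmin, hp, tpe

space = {  # hypothetical hyperparameters of a matrix-factorisation model
    "n_factors": hp.quniform("n_factors", 20, 200, 10),
    "lr_all": hp.loguniform("lr_all", -7, -2),
}

def objective(params):
    # Placeholder: in practice, train e.g. Surprise's SVD with these
    # parameters and return the validation RMSE.
    return (params["n_factors"] - 100) ** 2 * 1e-6 + params["lr_all"]

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)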
Arambakam, Mukesh; Beel, Joeran
Federated Meta-Learning: Democratizing Algorithm Selection Across Disciplines and Software Libraries Proceedings Article
In: 7th ICML Workshop on Automated Machine Learning, pp. 1–8, 2020.
@inproceedings{Arambakam2020,
title = {Federated Meta-Learning: Democratizing Algorithm Selection Across Disciplines and Software Libraries},
author = {Mukesh Arambakam and Joeran Beel},
url = {https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_39.pdf},
year = {2020},
date = {2020-01-01},
booktitle = {7th ICML Workshop on Automated Machine Learning},
pages = {1–8},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Carroll, Oisín; Beel, Joeran
Finite Group Equivariant Neural Networks for Games Journal Article
In: arXiv, no. 2009.05027, pp. 1–8, 2020.
@article{Carroll2020,
title = {Finite Group Equivariant Neural Networks for Games},
author = {Oisín Carroll and Joeran Beel},
url = {https://arxiv.org/abs/2009.05027},
year = {2020},
date = {2020-01-01},
journal = {arXiv},
number = {2009.05027},
pages = {1–8},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Collins, Andrew; Tierney, Laura; Beel, Joeran
Per-Instance Algorithm Selection for Recommender Systems via Instance Clustering Journal Article
In: arXiv, no. 2012.15151, 2020.
@article{Collins2020,
title = {Per-Instance Algorithm Selection for Recommender Systems via Instance Clustering},
author = {Andrew Collins and Laura Tierney and Joeran Beel},
url = {https://browse.arxiv.org/pdf/2012.15151.pdf},
year = {2020},
date = {2020-01-01},
journal = {arXiv},
number = {2012.15151},
abstract = {Recommendation algorithms perform differently if the users, recommendation contexts, applications, and user interfaces vary even slightly. It is similarly observed in other fields, such as combinatorial problem solving, that algorithms perform differently for each instance presented. In those fields, meta-learning is successfully used to predict an optimal algorithm for each instance, to improve overall system performance. Per-instance algorithm selection has thus far been unsuccessful for recommender systems. In this paper we propose a per-instance meta-learner that clusters data instances and predicts the best algorithm for unseen instances according to cluster membership. We test our approach using 10 collaborative- and 4 content-based filtering algorithms, for varying clustering parameters, and find a significant improvement over the best performing base algorithm at alpha=0.053 (MAE: 0.7107 vs LightGBM 0.7214; t-test). We also explore the performances of our base algorithms on a ratings dataset and empirically show that the error of a perfect algorithm selector monotonically decreases for larger pools of algorithm. To the best of our knowledge, this is the first effective meta-learning technique for per-instance algorithm selection in recommender systems.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
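The proposed meta-learner clusters instances and then routes each unseen instance to the algorithm that performed best within its cluster. A minimal sketch with k-means and illustrative names; not the paper's exact setup.

import numpy as np
from sklearn.cluster import KMeans

def fit_selector(X_val, errors_per_algo, n_clusters=8):
    # errors_per_algo: (n_instances, n_algorithms) validation errors.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_val)
    best = np.array([  # lowest-error algorithm per cluster
        errors_per_algo[km.labels_ == c].mean(axis=0).argmin()
        for c in range(n_clusters)
    ])
    return km, best

def select_algorithms(km, best, X_new):
    return best[km.predict(X_new)]  # chosen algorithm index per unseen instance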
Beel, Joeran
Recommender-Systems.Com: A Central Platform for the Recommender-System Community Proceedings Article
In: Fourteenth ACM Conference on Recommender Systems, pp. 600–603, Association for Computing Machinery, Virtual Event, Brazil, 2020, ISBN: 9781450375832.
@inproceedings{Beel2020,
title = {Recommender-Systems.Com: A Central Platform for the Recommender-System Community},
author = {Joeran Beel},
url = {https://doi.org/10.1145/3383313.3411522},
doi = {10.1145/3383313.3411522},
isbn = {9781450375832},
year = {2020},
date = {2020-01-01},
booktitle = {Fourteenth ACM Conference on Recommender Systems},
pages = {600–603},
publisher = {Association for Computing Machinery},
address = {Virtual Event, Brazil},
series = {RecSys '20},
abstract = {We introduce Recommender-Systems.com (RS_c) as a central platform
for the recommender-systems community. RS_c provides regular news
on important events in the community as well as curated lists of
recommender-system resources including datasets, algorithms, jobs,
software, and learning materials. Based on a survey with 28 participants
– mostly authors at the RecSys 2019 conference – 91% agree that RS_c
could be a major contribution to the community. Participants consider
it currently particularly difficult to find best practice guidelines
(45%); researchers, freelancers and employers (45%); and curated
lists of state-of-the-art algorithms, software, and datasets (36%).
Notably, only 19% consider it (very) easy to find material relating
to diversity, equality and anti-discrimination.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Beel, Joeran; Tyrrell, Bryan; Bergman, Edward; Collins, Andrew; Nagoor, Shahad
Siamese Meta-Learning and Algorithm Selection with ‘Algorithm-Performance Personas’ [Proposal] Journal Article
In: arXiv:2006.12328 [cs.LG], 2020.
@article{Beel2020b,
title = {Siamese Meta-Learning and Algorithm Selection with ‘Algorithm-Performance Personas’ [Proposal]},
author = {Joeran Beel and Bryan Tyrrell and Edward Bergman and Andrew Collins and Shahad Nagoor},
year = {2020},
date = {2020-01-01},
journal = {arXiv:2006.12328 [cs.LG]},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Al-Rawi, Mohammed; Beel, Joeran
Towards an Interoperable Data Protocol Aimed at Linking the Fashion Industry with AI Companies Journal Article
In: arXiv:2009.03005, 2020.
@article{AlRawi2020,
title = {Towards an Interoperable Data Protocol Aimed at Linking the Fashion Industry with AI Companies},
author = {Mohammed Al-Rawi and Joeran Beel},
url = {https://arxiv.org/abs/2009.03005},
year = {2020},
date = {2020-01-01},
journal = {arXiv:2009.03005},
abstract = {The fashion industry is looking forward to use artificial intelligence technologies to enhance their processes, services, and applications. Although the amount of fashion data currently in use is increasing, there is a large gap in data exchange between the fashion industry and the related AI companies, not to mention the different structure used for each fashion dataset. As a result, AI companies are relying on manually annotated fashion data to build different applications. Furthermore, as of this writing, the terminology, vocabulary and methods of data representation used to denote fashion items are still ambiguous and confusing. Hence, it is clear that the fashion industry and AI companies will benefit from a protocol that allows them to exchange and organise fashion information in a unified way. To achieve this goal we aim (1) to define a protocol called DDOIF that will allow interoperability of fashion data; (2) for DDOIF to contain diverse entities including extensive information on clothing and accessories attributes in the form of text and various media formats; and (3)To design and implement an API that includes, among other things, functions for importing and exporting a file built according to the DDOIF protocol that stores all information about a single item of clothing. To this end, we identified over 1000 class and subclass names used to name fashion items and use them to build the DDOIF dictionary. We make DDOIF publicly available to all interested users and developers and look forward to engaging more collaborators to improve and enrich it.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2019 and earlier
For a full list of all publications, please visit our publications page or my Google Scholar page.
Photo of Prof. Joeran Beel: © 2021 Sascha Hüttenhain Photography