Green Recommender Systems: Down-Sampling Datasets for Energy-Efficient Algorithm Performance

Abstract As recommender systems become increasingly prevalent, the environmental impact and energy efficiency of training these large-scale models have come under scrutiny. This paper investigates the potential for energy-efficient algorithm performance by optimizing dataset sizes through downsampling techniques. We conducted experiments on the MovieLens 100K, 1M, 10M and Amazon Toys Read more…

ISG will present 8 papers and posters at the ACM Recommender Systems Conference and Workshops

We are thrilled that 8 of our 11 submissions to the 18th ACM Recommender Systems Conference and Workshops (RobustRecSys and RecSoGood) were accepted for publication. Our research was conducted jointly with partners from the University of Gothenburg (Alan Said), the University of Antwerp (Lien Michiels), and some excellent Bachelor and Master Read more…

From Theory to Practice: Implementing and Evaluating e-Fold Cross-Validation

Accepted for publication at the International Conference on Artificial Intelligence and Machine Learning Research (CAIMLR). The PDF is available here. Feel free to also read the original proposal that led to the current publication. Abstract In this paper, we present e-fold cross-validation, an energy-efficient alternative to k-fold, which dynamically adjusts Read more…

From Clicks to Carbon: The Ecological Costs of Recommender Systems (Pre-Print)

Full pre-print as PDF: https://arxiv.org/abs/2408.08203 Abstract As global warming soars, the need to assess the environmental impact of research is becoming increasingly urgent. Despite this, few recommender systems research papers address their environmental impact. In this study, we estimate the ecological impact of recommender systems research by reproducing typical experimental Read more…

Synthetic vs. Real Reference Strings for Citation Parsing, and the Importance of Re-training and Out-Of-Sample Data for Meaningful Evaluations: Experiments with GROBID, GIANT and CORA [pre-print]

ABSTRACT Citation parsing, particularly with deep neural networks, suffers from a lack of training data as available datasets typically contain only a few thousand training instances. Manually labelling citation strings is very time-consuming, hence synthetically created training data could be a solution. However, as of now, it is unknown if Read more…

GIANT: The 1-Billion Annotated Synthetic Bibliographic-Reference-String Dataset for Deep Citation Parsing [pre-print]

This is the pre-print of: Mark Grennan, Martin Schibel, Andrew Collins, and Joeran Beel. “GIANT: The 1-Billion Annotated Synthetic Bibliographic-Reference-String Dataset for Deep Citation Parsing.” In 27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science, 101–112, 2019. Final publication: http://aics2019.datascienceinstitute.ie/papers/aics_25.pdf Abstract. Extracting and parsing reference strings from research articles Read more…

Keyphrase counts and their effect on clickthrough rates (CTR)

Document Embeddings vs. Keyphrases vs. Terms: An Online Evaluation in Digital Library Recommender Systems

Our paper “Document Embeddings vs. Keyphrases vs. Terms: An Online Evaluation in Digital Library Recommender Systems” was accepted for publication at the ACM/IEEE Joint Conference on Digital Libraries. 1 Introduction Many recommendation algorithms are available to operators of recommender systems in digital libraries. The effectiveness of algorithms in real-world systems is Read more…

Click-through rate (CTR) and number of delivered recommendations in JabRef for Mr. DLib’s (MDL) and CORE’s recommendation engines and in total

Mr. DLib’s Living Lab for Scholarly Recommendations (preprint)

We published a manuscript on arXiv about the first living lab for scholarly recommender systems. This lab allows recommender-system researchers to conduct online evaluations of their novel algorithms for scholarly recommendations, i.e., research papers, citations, conferences, research grants, etc. Recommendations are delivered through the living lab’s API in platforms such Read more…

The results of the comparison of 10 open-source bibliographic reference parsers

Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers

Our paper “Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers” was recently accepted and will be presented at the Joint Conference on Digital Libraries 2018. Abstract: Bibliographic reference parsing refers to extracting machine-readable metadata, such as the names of the authors, the Read more…

The workflow of author contributions extraction

Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes

Our paper “Who Did What? Identifying Author Contributions in Biomedical Publications using Naïve Bayes” was recently accepted and will be presented at the Joint Conference on Digital Libraries 2018. Abstract: Creating scientific publications is a complex process. It is composed of a number of different activities, such as designing the experiments, Read more…

RARD I: The Related-Article Recommender-System Dataset

RARD: The Related-Article Recommendation Dataset

We are proud to announce the release of ‘RARD’, the related-article recommendation dataset from the digital library Sowiport and the recommendation-as-a-service provider Mr. DLib. The dataset contains information about 57.4 million recommendations that were displayed to the users of Sowiport. Information includes details on which recommendation approaches were used (e.g. content-based Read more…

Several new publications: Mr. DLib, Lessons Learned, Choice Overload, Bibliometrics (Mendeley Readership Statistics), Apache Lucene, CC-IDF, TF-IDuF

In the past few weeks, we published (or received acceptance notices for) a number of papers related to Mr. DLib, research-paper recommender systems, and recommendations-as-a-service. Many of them were written during our time at the NII or in collaboration with the NII. Here is the list of publications: Beel, Joeran, Bela Gipp, Read more…

Paper accepted at ISI conference in Berlin: “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport”

Our paper titled “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport” has been accepted for publication at the 15th International Symposium on Information Science (ISI) in Berlin. Abstract: Stereotype and most-popular recommendations are widely neglected in the research-paper recommender-system and digital-library community. In other domains such as movie recommendations and hotel Read more…

Enhanced re-ranking in our recommender system based on Mendeley’s readership statistics

Content-based filtering recommendations suffer from the problem that no human quality assessments are taken into account. This means a poorly written paper p_poor would be considered equally relevant for a given input paper p_input as a high-quality paper p_quality if p_quality and p_poor contain the same words. We alleviate this problem by using Mendeley’s readership data Read more…
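The re-ranking idea in the excerpt above can be illustrated with a minimal sketch. The excerpt does not say how readership data is combined with the content-based score, so the log-scaled boost, the `weight` parameter, the `Candidate` class, and the reader counts below are all assumptions made for illustration, not the method used in the actual system.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    paper_id: str
    text_score: float  # content-based similarity to the input paper
    readers: int       # hypothetical readership count for the paper


def rerank(candidates, weight=0.1):
    """Sort candidates by text score boosted by log-scaled readership.

    The boost formula is an assumption chosen for illustration only.
    """
    def combined(c):
        return c.text_score * (1.0 + weight * math.log1p(c.readers))

    return sorted(candidates, key=combined, reverse=True)


# Two papers with identical text scores but different readership,
# plus a third, less similar paper (all values are made up).
candidates = [
    Candidate("p_poor", 0.80, 0),
    Candidate("p_quality", 0.80, 250),
    Candidate("p_other", 0.60, 10),
]
ranked = [c.paper_id for c in rerank(candidates)]
print(ranked)
```

With equal text scores, the paper with more readers moves ahead of the one with none, which is the tie-breaking effect the excerpt describes; the log scaling is one possible way to keep very popular papers from dominating the text score.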