Mr. DLib v1.1 released: JavaScript Client, 15 million CORE documents, new URL for recommendations-as-a-service via title search

We are proud to announce version 1.1 of Mr. DLib’s Recommender-System as-a-Service. The major new feature is a JavaScript client to request recommendations from Mr. DLib. The JavaScript client offers many advantages compared to server-side processing of our recommendations. Among others, the main page will load faster while recommendations are requested in the Read more…

Paper accepted at ISI conference in Berlin: “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport”

Our paper titled “Stereotype and Most-Popular Recommendations in the Digital Library Sowiport” has been accepted for publication at the 15th International Symposium on Information Science (ISI) in Berlin. Abstract: Stereotype and most-popular recommendations are widely neglected in the research-paper recommender-system and digital-library community. In other domains such as movie recommendations and hotel Read more…

Enhanced re-ranking in our recommender system based on Mendeley’s readership statistics

Content-based filtering recommendations suffer from the problem that no human quality assessments are taken into account. This means a poorly written paper p_poor would be considered just as relevant for a given input paper p_input as a high-quality paper p_quality if p_quality and p_poor contain the same words. We alleviate this problem by using Mendeley’s readership data Read more…
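A minimal sketch of how such a readership-based re-ranking could work. The function, the min-max normalization, and the 0.7/0.3 weighting are our illustration of the general idea, not Mr. DLib’s actual implementation:

```python
def rerank_by_readership(candidates, readership):
    """Re-rank content-based candidates using Mendeley reader counts.

    candidates: list of (paper_id, text_similarity) pairs, as produced
                by a content-based filtering step.
    readership: dict mapping paper_id -> Mendeley reader count.
    """
    # Normalize reader counts to [0, 1] so they are on a comparable
    # scale to the similarity scores.
    max_readers = max((readership.get(pid, 0) for pid, _ in candidates),
                      default=0)

    def score(item):
        pid, similarity = item
        readers = (readership.get(pid, 0) / max_readers) if max_readers else 0.0
        # Blend text similarity with the readership signal; the weights
        # here are an arbitrary choice for this sketch.
        return 0.7 * similarity + 0.3 * readers

    return sorted(candidates, key=score, reverse=True)
```

With this scheme, two papers that match the input paper equally well on text are tie-broken by how many Mendeley users have read them, so p_quality would outrank p_poor.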

Server status of Mr. DLib’s recommender system publicly available

Our servers for Mr. DLib’s recommender-system as-a-service (RaaS) are now monitored by UptimeRobot, a free monitoring service. You can access all our RaaS server statuses at https://stats.uptimerobot.com/WLL5PUjN6, where you will see a dashboard like this: A click on one of the server names will show you more details, e.g. https://stats.uptimerobot.com/WLL5PUjN6/778037437

New recommendation algorithms integrated to Mr. DLib’s recommender system

We have integrated several new recommendation algorithms into Mr. DLib. Some of these algorithms are intended only as baselines for our researchers; others will hopefully further increase the effectiveness of Mr. DLib. Overall, Mr. DLib now uses the following recommendation algorithms in its recommender system: Random Recommendations This approach randomly picks Read more…

First Pilot Partner (GESIS’ Sowiport) Integrates Mr. DLib’s Recommendations as-a-Service

We are proud to announce that the social science portal Sowiport is using Mr. DLib’s recommender-system as-a-service as its first pilot partner. Sowiport pools and links quality information from domestic and international providers, making it available in one place. Sowiport currently contains 9.5 million references to publications and research projects. The Read more…

Docear 1.0.3 Beta: rate recommendation, new web interface, bug fixes, …

Update, February 18, 2014: No bugs were reported, so we declare Docear 1.0.3 with its recommender system stable. It can be downloaded from the normal download page.


With Docear 1.0.3 beta we have improved PDF handling and the recommender system, provided some help for new users, and enhanced how you can access your mind maps online.

PDF Handling

We fixed several minor bugs related to PDF handling. In previous versions of Docear, nested PDF bookmarks were imported twice when you dragged & dropped a PDF file onto a mind map. Renaming PDF files from within Docear changed the file links in your mind maps but did not change them in your BibTeX file. Both issues are fixed now. To rename a PDF file from within Docear, right-click it in Docear’s workspace panel on the left-hand side; it is important that the mind maps linking to the file are open. We know this is still not ideal, and we will improve it in future versions of Docear.

Rate Your Recommendations

You already know about our recommender system for academic literature. If you want to help us improve it, you can now rate how well a specific set of recommendations reflects your personal field of interest. By the way, please do not rate a set of recommendations negatively only because it contains some recommendations you received previously; we currently have no mechanism to detect duplicate recommendations.

rate a literature recommendation set

(more…)

New paper: “A Comparative Analysis of Offline and Online Evaluations and Discussion of Research Paper Recommender System Evaluation”

Yesterday, we published a pre-print on the shortcomings of current research-paper recommender-system evaluations. One of the findings was that results of offline and online experiments sometimes contradict each other. We analyzed this issue in more detail and wrote a new paper about it. More specifically, we conducted a comprehensive evaluation of a set of recommendation algorithms using (a) an offline evaluation and (b) an online evaluation. Results of the two evaluation methods were compared to determine whether and when they contradicted each other. Subsequently, we discussed the differences and validity of evaluation methods, focusing on research-paper recommender systems. The goal was to identify which of the evaluation methods is most authoritative, or whether some methods are unsuitable in general. By ‘authoritative’, we mean which evaluation method one should trust when results of different methods contradict each other.

Bibliographic data: Beel, J., Langer, S., Genzmehr, M., Gipp, B. and Nürnberger, A. 2013. A Comparative Analysis of Offline and Online Evaluations and Discussion of Research Paper Recommender System Evaluation. Proceedings of the Workshop on Reproducibility and Replication in Recommender Systems Evaluation (RepSys) at the ACM Recommender System Conference (RecSys) (2013), 7–14.

Our current results cast doubt on the meaningfulness of offline evaluations. We showed that offline evaluations often could not predict the results of online experiments (measured by click-through rate, CTR), and we identified two possible reasons.

The first reason for the lacking predictive power of offline evaluations is that they ignore human factors. These factors may strongly influence whether users are satisfied with recommendations, regardless of the recommendations’ relevance. We argue that it will probably never be possible to determine when and how influential human factors are in practice. Thus, it is impossible to determine when offline evaluations have predictive power and when they do not. Assuming that the only purpose of offline evaluations is to predict results in real-world settings, the plausible consequence is to abandon offline evaluations entirely.
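One way to quantify how strongly offline and online results agree is to rank the evaluated algorithms by each method and compute a rank correlation between the two rankings. A self-contained sketch using Kendall’s tau (the implementation and the example scores are ours, purely for illustration):

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall rank correlation between two score lists for the same
    algorithms: +1 = identical ranking, -1 = exactly reversed ranking."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # A pair is concordant if both methods order it the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs

# Hypothetical scores for five algorithms, one value per algorithm:
offline_precision = [0.30, 0.25, 0.20, 0.15, 0.10]
online_ctr        = [0.05, 0.07, 0.04, 0.09, 0.03]
```

A tau near zero for such made-up scores would indicate that the offline ranking tells us little about which algorithm users actually click on, which is the kind of contradiction discussed above.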

(more…)

New pre-print: “Research Paper Recommender System Evaluation: A Quantitative Literature Survey”

As you might know, Docear has a recommender system for research papers, and we are putting a lot of effort into improving it. Actually, the development of the recommender system is part of my PhD research. When I began my work on the recommender system, some years ago, I became quite frustrated because there were so many different approaches for recommending research papers, but I had no clue which one would be most promising for Docear. I read many, many papers (far more than 100), and although they presented many interesting ideas, the evaluations… well, most of them were poor. Consequently, I just did not know which approaches to use in Docear.

Meanwhile, we have reviewed all these papers more carefully and analyzed exactly how the authors conducted their evaluations. More precisely, we analyzed the papers with respect to the following questions.

  1. To what extent do authors perform user studies, online evaluations, and offline evaluations?
  2. How many participants do user studies have?
  3. Against which baselines are approaches compared?
  4. Do authors provide information about the algorithms’ runtime and computational complexity?
  5. Which metrics are used for algorithm evaluation, and do different metrics provide similar rankings of the algorithms?
  6. Which datasets are used for offline evaluations?
  7. Are results comparable among different evaluations based on different datasets?
  8. How consistent are online and offline evaluations? Do they provide the same, or at least similar, rankings of the evaluated approaches?
  9. Do authors provide sufficient information to re-implement their algorithms or replicate their experiments?

(more…)

Three new research papers (for TPDL’13) about user demographics and recommender evaluations, sponsored recommendations, and recommender persistence

After three demo papers were accepted for JCDL 2013, we just received notice that another three posters have been accepted for presentation at TPDL 2013 in Malta in September 2013. They cover some novel aspects of recommender systems: re-showing recommendations multiple times, considering user demographics when evaluating recommender systems, and investigating the effect of labelling recommendations. But you can read the papers yourself, as we publish them as pre-prints:

Paper 1: The Impact of Users’ Demographics (Age and Gender) and other Characteristics on Evaluating Recommender Systems (Download PDF | Doc)

In this paper we show the importance of considering demographics and other user characteristics when evaluating (research-paper) recommender systems. We analyzed 37,572 recommendations delivered to 1,028 users and found that older users clicked on recommendations more often than younger ones. For instance, users aged 20 to 24 achieved click-through rates (CTR) of 2.73% on average, while the CTR for users aged 50 to 54 was 9.26%. Gender had only a marginal impact (CTR: 6.88% for males, 6.67% for females), but other user characteristics, such as whether a user was registered (CTR: 6.95%) or not (4.97%), had a strong impact. Based on these results, we argue that future research articles on recommender systems should report demographic data to make results more comparable.
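Click-through rate is simply the number of clicked recommendations divided by the number of delivered ones. A sketch of computing per-group CTRs from a log of delivery events (the event schema and field names are our assumption, not the paper's data format):

```python
from collections import defaultdict

def ctr_by_group(events, group_key):
    """Compute click-through rate (in percent) per user group.

    events: iterable of dicts, one per delivered recommendation, e.g.
        {"group": "20-24", "delivered": 1, "clicked": 0}
    group_key: the dict key identifying the group (e.g. an age band).
    """
    delivered = defaultdict(int)
    clicked = defaultdict(int)
    for e in events:
        delivered[e[group_key]] += e["delivered"]
        clicked[e[group_key]] += e["clicked"]
    # CTR as a percentage, matching how the paper reports it.
    return {g: 100.0 * clicked[g] / delivered[g] for g in delivered}
```

Grouping the same events by registration status instead of age band is just a matter of passing a different `group_key`, which is what makes per-characteristic comparisons like the ones above cheap to compute.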

(more…)