Two of our papers about citation and term-weighting schemes got accepted at iConference 2017

Title page of the CC-IDF Paper

Two of our papers on weighting citations and terms for user modeling and recommender systems were accepted at iConference 2017. The abstracts and links to the pre-prints follow:

Evaluating the CC-IDF citation-weighting scheme: How effectively can ‘Inverse Document Frequency’ (IDF) be applied to references?

In the domain of academic search engines and research-paper recommender systems, CC-IDF is a common citation-weighting scheme that is used to calculate semantic relatedness between documents. CC-IDF adopts the principles of the popular term-weighting scheme TF-IDF and assumes that if a rare academic citation is shared by two documents, then this occurrence should receive a higher weight than if the citation is shared among a large number of documents. Although CC-IDF is in common use, we found no empirical evaluation and comparison of CC-IDF with plain citation weight (CC-Only). Therefore, we conducted such an evaluation and present the results in this paper. The evaluation was conducted with real users of the recommender system Docear. The effectiveness of CC-IDF and CC-Only was measured using click-through rate (CTR). For 238,681 delivered recommendations, CC-IDF had about the same effectiveness as CC-Only (CTR of 6.15% vs. 6.23%). In other words, CC-IDF was not more effective than CC-Only, which is a surprising result. We provide a number of potential reasons and suggest conducting further research to understand the principles of CC-IDF in more detail.

Pre-print: http://beel.org/publications/2017%20iConference%20–%20Evaluating%20the%20CC-IDF%20citation-weighting%20scheme%20–%20preprint.pdf
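To make the difference between the two schemes concrete, here is a minimal sketch. It is not the implementation used in Docear or in the paper: the function names, the toy corpus, and the choice of `log(N / df)` as the IDF variant are all assumptions made for illustration. CC-Only simply counts shared references, while CC-IDF sums an IDF weight over them, so a rarely cited reference counts for more than one that nearly every document cites.

```python
import math
from collections import Counter

def citation_idf(corpus):
    """Return an IDF weight per cited item.

    corpus maps a document id to the set of references it cites.
    Assumption: idf(c) = log(N / df(c)), one of several common IDF variants.
    """
    n_docs = len(corpus)
    df = Counter(ref for refs in corpus.values() for ref in refs)
    return {ref: math.log(n_docs / count) for ref, count in df.items()}

def cc_only(refs_a, refs_b):
    """Plain co-citation weight: number of references both documents share."""
    return len(refs_a & refs_b)

def cc_idf(refs_a, refs_b, idf):
    """IDF-weighted variant: shared references weighted by their rarity."""
    return sum(idf[ref] for ref in refs_a & refs_b)

# Hypothetical toy corpus: "popular" is cited by every document, "rare" by two.
corpus = {
    "d1": {"popular", "rare"},
    "d2": {"popular", "rare"},
    "d3": {"popular", "other"},
    "d4": {"popular"},
}
idf = citation_idf(corpus)

print(cc_only(corpus["d1"], corpus["d2"]), cc_idf(corpus["d1"], corpus["d2"], idf))
print(cc_only(corpus["d1"], corpus["d3"]), cc_idf(corpus["d1"], corpus["d3"], idf))
```

In this toy corpus, d1 and d3 share only the reference that every document cites, so CC-Only still counts one shared reference while the CC-IDF score is zero.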

TF-IDuF: A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections.

TF-IDF is one of the most popular term-weighting schemes, and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always given in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which could, so we hypothesize, enhance the user modeling process. In this paper, we introduce TF-IDuF as a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible, and thus classic IDF cannot be computed. It is also notable that TF-IDuF and TF-IDF are not exclusive, so that both metrics may be combined into a more effective term-weighting scheme.

Pre-print: http://beel.org/publications/2017%20iConference%20–%20TF-IDuF%20-%20A%20Novel%20Term-Weighting%20Scheme%20–%20preprint.pdf
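The idea of replacing the corpus-wide IDF with one computed on the user's own documents can be sketched as follows. This is an illustration under stated assumptions, not the formula from the paper: the abstract does not give the exact definition, so the sketch assumes that the user-focused inverse frequency is computed like classic IDF, `log(N_u / df_u(t))`, but over the user's personal collection only. The function name `tf_iduf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_iduf(profile_docs, user_collection):
    """Weight terms for a user model without touching the recommendation corpus.

    profile_docs:    tokenized documents used to build the user model
    user_collection: all tokenized documents in the user's personal collection
    Assumption: weight(t) = tf(t) * log(N_u / df_u(t)), with N_u and df_u
    taken from the personal collection only.
    """
    tf = Counter(term for doc in profile_docs for term in doc)
    n_user_docs = len(user_collection)
    df_user = Counter(term for doc in user_collection for term in set(doc))
    return {
        term: freq * math.log(n_user_docs / df_user[term])
        for term, freq in tf.items()
        if df_user[term] > 0  # skip terms absent from the personal collection
    }

# Hypothetical personal collection of three tokenized documents.
collection = [
    ["recommender", "systems", "citation"],
    ["recommender", "evaluation"],
    ["recommender", "citation", "weighting"],
]
weights = tf_iduf(collection[:2], collection)
print(weights)
```

A term that appears in every document of the user's collection ("recommender" here) receives zero weight, while terms confined to a single document rank highest, and none of this requires access to the corpus from which recommendations are drawn.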

Published by Joeran Beel on 15th December 2016
