bibliographic data – ISG Siegen

Preview of Docear 1.1 with PDF Metadata Retrieval from Google Scholar

Thanks to all the generous donors, our student Christoph was able to work on improved PDF metadata retrieval for Docear, and today it’s time to present the first preview. The new Docear 1.1 (preview) can extract the title of a PDF and fetch the matching metadata from Google Scholar. Whenever you select a PDF in your mind map and choose “Create or Update reference”, the following new dialog appears.

The dialog shows the file name of your PDF and the extracted title. In the background, the extracted title is sent to Google Scholar, and the metadata for the first three search results is shown in the dialog. If the title was extracted incorrectly, you can correct it manually. You may also choose to use the PDF’s file name for the search. For instance, if you have already named your PDF after its title, select the radio button with the file name, and the file name is sent to Google Scholar as the search query (you may also correct the file name manually before it is sent).

Of course, all the other options you already know are still available, such as creating a blank entry or importing the XMP data of PDFs. By the way, Docear remembers your choice: if you choose to create a blank entry, that option will be pre-selected the next time you open the dialog.

Google Scholar might block your IP if you use the service too frequently. If this happens, a captcha should appear, and after solving it you should be able to proceed. We have not yet tested this thoroughly, so please let us know about your experience.
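To make the choice between the extracted title and the file name more concrete, here is a minimal Python sketch of how such a query could be picked. This is not Docear's actual code: the function names `build_search_query` and `top_results`, their parameters, and the way the file name is cleaned up (dropping the extension, treating underscores and hyphens as spaces) are assumptions made purely for illustration. The sketch does not contact Google Scholar.

```python
from pathlib import PurePath
from typing import Optional, Sequence


def build_search_query(
    extracted_title: str,
    file_name: str,
    use_file_name: bool = False,
    manual_correction: Optional[str] = None,
) -> str:
    """Pick the text that would be sent as the search query (hypothetical helper)."""
    # A manual correction typed by the user takes priority over everything else.
    if manual_correction is not None and manual_correction.strip():
        return manual_correction.strip()
    if use_file_name:
        # Assumption: drop the extension and treat "_" and "-" as word separators.
        stem = PurePath(file_name).stem
        return " ".join(stem.replace("_", " ").replace("-", " ").split())
    # Default: the extracted title, with whitespace normalised.
    return " ".join(extracted_title.split())


def top_results(results: Sequence, limit: int = 3) -> list:
    """Keep only the first few search results, as the dialog shows three."""
    return list(results[:limit])


query = build_search_query("", "Research_Paper_Recommender-Systems.pdf", use_file_name=True)
print(query)
```

Whether a real implementation should replace hyphens is debatable, since some titles contain them on purpose; the manual correction field covers such cases.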

The precision of our metadata tool depends on two factors: A) the precision of the title extraction and B) the coverage of Google Scholar. According to a recent experiment, the title extraction precision of our tool is around 70%. However, the final result depends very much on the format of your research articles. In my research field (recommender systems), I would say that our tool extracts the title correctly for about 90% of the articles in my personal library. In addition, almost all articles that are relevant for my research are indexed by Google Scholar (I would estimate more than 90%). This means that for around 80% of my PDFs (roughly 90% of 90%) the correct metadata is retrieved fully automatically. If I provide the title manually, the metadata can be retrieved for more than 90% of them. Please let us know your experience (and your research field).
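The "around 80%" figure follows from multiplying the two factors. Here is a small Python sketch of that arithmetic, assuming that title extraction and Google Scholar coverage are independent of each other (the post does not state this explicitly). The function name `automatic_retrieval_rate` is made up for this example.

```python
def automatic_retrieval_rate(title_precision: float, scholar_coverage: float) -> float:
    """Estimate the share of PDFs whose metadata is retrieved fully automatically.

    Assumes both inputs are fractions between 0 and 1 and that the two
    factors are independent, so they can simply be multiplied.
    """
    for value in (title_precision, scholar_coverage):
        if not 0.0 <= value <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return title_precision * scholar_coverage


# Estimates from the post: about 90% title precision, about 90% coverage.
rate = automatic_retrieval_rate(0.90, 0.90)
print(f"{rate:.0%}")  # prints "81%"
```

With a manually provided title, the first factor is effectively 1.0, so the result is bounded only by Google Scholar's coverage, which matches the "more than 90%" estimate.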

By Joeran Beel, 12 years ago

Docear

Call for donation was successful: 1800 Euros donated to improve Docear’s PDF metadata retrieval function

One month ago, we started a call for donations and asked our users for money so that we could pay our student Christoph to improve Docear’s PDF metadata retrieval. We asked for 1800 Euros (~2500 US$), and today we achieved our goal. We would like to thank all donors who Read more…

By Joeran Beel, 12 years ago

Docear

Now 50 instead of 15 free requests per day for bibliographic data

Just a quick notice: for a while now, Docear has been able to automatically retrieve bibliographic information for PDF files. So far, only 15 requests per day per user were possible. We have increased this limit to 50. If you have not yet tried this feature, give it a go. It saves a lot of time!

By Joeran Beel, 13 years ago

Beta

Docear Beta 7 with PDF Metadata Extraction

Beta 7 is out and has one major new feature: (semi-)automatic extraction of bibliographic metadata from PDF files. This means that when creating a new reference, you don’t have to type everything manually. Instead, bibliographic information such as title, author, year, journal, etc. is provided to you automatically. Here is how it works:

Right-click a node with a PDF and select the option shown in the picture


By Joeran Beel, 14 years ago