Update 2018-07-31: We updated the Dropbox link. Google Scholar recently changed its layout, and as a consequence, Docear could no longer fetch metadata from Google Scholar for PDF files. Fortunately, one of our users (“Silberzwiebel”) adjusted Docear’s Google Scholar parser, and now everything works as usual. However, we have not yet integrated Read more…
Mr. DLib Recommendations-as-a-Service v1.3: “Word Embeddings” and Many Minor Improvements and Bug Fixes
We released version 1.3 of Mr. DLib’s Recommender-System-as-a-Service. The new major feature is recommendations based on “word embeddings”. We are excited to see how the new recommendations will perform with our partners. In addition, we fixed many small bugs and added some minor improvements. A complete overview can be found Read more…
The new version of our recommender system completes 104 issues and significantly improves the recommendations. The most notable improvements are: We improved the keyphrase extraction process in the recommender system, i.e. keyphrases are now stored differently in Lucene. We expect better recommendation effectiveness and are currently running an A/B test. More Read more…
There are two major pieces of news coming along with the new version of Mr. DLib’s Recommendation API. JabRef finally uses Mr. DLib for its recommender system. We announced this a while ago, but now, finally, Mr. DLib’s recommendations are available in one of the most popular open-source reference managers, Read more…
On 28th February, we released version 1.1.1 of Mr. DLib’s recommender system with some minor improvements and bug fixes: Improved 404 error handling for unknown document IDs Fix: The order of authors in the XML was not sorted properly Several internal changes (adjusted logging table; click time is not updated any Read more…
So far, Mr. DLib’s recommender system was running only on a single server. Consequently, when we messed something up in the development environment, the production system was sometimes affected, i.e. down. From today on, we have two additional dedicated servers running, meaning we have a total of three recommender-system servers, one for Read more…
Docear 1.1 Beta Released: New PDF Metadata Extraction, Better Zotero and Mendeley BibTeX support, and Bug Fixes
If you have tested the Preview of Docear 1.1 you may already know about some of Docear’s new features. With your feedback and the mind maps, log files and BibTeX files you shared with us, these features have matured. We are proud to introduce the first (and hopefully only) Beta release of Docear 1.1.
The new key features of Docear 1.1
Improved metadata retrieval
Thanks to your donations, our student Christoph greatly enhanced Docear’s PDF metadata retrieval. For us, it works really well, and with Docear 1.1 Beta the last bugs have been fixed. Btw. if you like what Christoph did, and if you are using LibreOffice or OpenOffice, please also read our call for donations to develop an add-on for these two text processing tools.
Improved support for Zotero / Mendeley BibTeX files
Thanks to all the generous donors, our student Christoph could work on an improved PDF metadata retrieval for Docear, and today it’s time to present the first preview. The new Docear 1.1 (preview) is able to extract the title of a PDF and fetch appropriate metadata from Google Scholar. Whenever you select a PDF in your mind map and choose “Create or Update reference”, the following new dialog appears.
The dialog shows the file name of your PDF file and the extracted title. In the background, the extracted title is sent to Google Scholar, and metadata for the first three search results is shown in the dialog. If the title was extracted incorrectly, you can manually correct it. You may also choose to use the PDF’s file name for the search. For instance, if you have already named your PDF according to the title, select the radio button with the file name, and the file name is sent as the search query to Google Scholar (you may also manually correct the file name before it’s sent to Google Scholar). Of course, all other options you already know are still available, such as creating a blank entry or importing the XMP data of PDFs. Btw. Docear remembers your choice, i.e. when you select to create a blank entry, the option will be pre-selected when you open that dialog the next time. Your IP might be blocked by Google Scholar if you use the service too frequently. If this happens, a captcha should appear, and after solving it, you should be able to proceed. We have not yet tested this thoroughly. Please let us know your experiences.
The precision of our metadata tool depends on two factors: A) the precision of the title extraction and B) the coverage of Google Scholar. According to a recent experiment, the title extraction accuracy of our tool is around 70%. However, the final result very much depends on the format of your research articles. In my research field (i.e. recommender systems), I would say that our tool extracts the title correctly for about 90% of the articles in my personal library. In addition, almost all articles that are relevant for my research are indexed by Google Scholar (I would estimate more than 90%). This means that for around 80% of my PDFs (roughly 90% × 90%) the correct metadata is retrieved fully automatically. If I provide the title manually, the metadata may be retrieved for even more than 90%. Please let us know your experience (and your research field). (more…)
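As a rough sketch of the arithmetic behind this estimate: it assumes that title extraction and Google Scholar coverage are independent of each other, which is an assumption made only for this illustration. The function name is made up.

```python
def automatic_retrieval_rate(title_accuracy, scholar_coverage):
    """Share of PDFs for which metadata is found without manual input.

    Assumption for illustration: the two factors are independent,
    so their rates can simply be multiplied.
    """
    return title_accuracy * scholar_coverage


# Personal library from the text: ~90% correct titles, ~90% coverage.
rate = automatic_retrieval_rate(0.9, 0.9)
print(f"{rate:.0%}")

# With a manually supplied title, only the coverage limits the result.
manual_rate = automatic_retrieval_rate(1.0, 0.9)
print(f"{manual_rate:.0%}")
```

With the general 70% title accuracy instead of 90%, the same calculation would give a noticeably lower rate, which is why the result depends so much on the format of the articles.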
Update: February 18, 2014: No bugs were reported, so we declare Docear 1.0.3 with its recommender system stable. It can be downloaded from the normal download page.
With Docear 1.0.3 beta we have improved PDF handling and the recommender system, provided some help for new users, and enhanced the way you can access your mind maps online.
We fixed several minor bugs with regard to PDF handling. In previous versions of Docear, nested PDF bookmarks were imported twice when you dragged and dropped a PDF file onto the mind map. Renaming PDF files from within Docear changed the file links in your mind maps but did not change them in your BibTeX file. Both issues are fixed now. To rename a PDF file from within Docear, you just have to right-click it in Docear’s workspace panel on the left-hand side. It is important that the mind maps in which you have linked the file are open. We know this is still not ideal, and we will improve it in future versions of Docear.
Rate Your Recommendations
You already know about our recommender system for academic literature. If you want to help us improve it, you can now rate how well a specific set of recommendations reflects your personal field of interest. Btw. it would be nice if you did not rate a set of recommendations negatively only because it contains some recommendations you received previously. Currently, we have no mechanism to detect duplicate recommendations.
The new Docear4Word v1.23 is out as a Beta version. Changes are: A more detailed error message when there is a parsing error in your BibTeX file. The latest v1.0.517 version of CiteProc-JS has been included. This should finally solve all the sorting and numbering issues. We made some adjustments that Read more…
Today, Docear 1.0 (stable) is finally available for Windows, Mac, and Linux to download. It’s been almost two years since we released the first private Alpha of Docear and we are really proud of what we accomplished since then. Docear is better than ever, and in addition to all the enhancements we made during the past years, we completely rewrote the manual with step-by-step instructions including an overview of supported PDF viewers, we changed the homepage, we created a new video, and we made the features & details page much more comprehensive. For those who already use Docear 1.0 RC4, there are not many changes (just a few bug fixes). For new users, we would like to explain what Docear is and what makes it so special.
Docear is a unique solution to academic literature management that helps you to organize, create, and discover academic literature. The three most distinct features of Docear are:
- A single-section user interface that differs significantly from the interfaces you know from Zotero, JabRef, Mendeley, EndNote, … and that allows a more comprehensive organization of your electronic literature (PDFs) and the annotations you created (i.e. highlighted text, comments, and bookmarks).
- A ‘literature suite concept’ that allows you to draft and write your own assignments, papers, theses, books, etc. based on the annotations you previously created.
- A research paper recommender system that allows you to discover new academic literature.
Aside from its unique approach, Docear offers many more features. In particular, we would like to point out that Docear is free, open source, not evil, and gives you full control over your data. Docear works with standard PDF annotations, so you can use your favorite PDF viewer. Your reference data is directly stored as BibTeX (a text-based format that can be read by almost any other reference manager). Your drafts and folders are stored in Freeplane’s XML format, again a text-based format that is easy to process and understood by several other applications. And although we offer several online services such as PDF metadata retrieval, backup space, and an online viewer, we do not force you to register. You can just install Docear on your computer, without any registration, and use 99% of Docear’s functionality.
But let’s get back to Docear’s unique approach for literature management…
There is a new version of Docear available for download. It’s basically the (experimental) RC1 version done right. RC2 fixes a lot of bugs that were caused by the new workspace model with multiple projects, features a refined and polished version of the Ribbon, fixes a lot of bugs in general, and supports the standard PDF viewers of Mac OS X (Preview and Skim) and probably a lot of other viewers as well!
If you are still using Beta9 of Docear, a lot of things will change and improve with this new version of Docear. However, converting your old maps to this new format is a one-way process (you can’t use these files with Beta9 of Docear anymore), and the process itself might take some time, depending on the size of your mind maps. Please back up your files before upgrading to Docear RC2.
A while ago we released Docear with automatic metadata extraction. Several users asked us if this functionality was also available as a stand-alone tool. Therefore, we decided to create a little tool, which we named “Docear’s PDF Inspector”. Docear’s PDF Inspector extracts titles from PDF files not from the PDF’s metadata but from its full-text. More precisely, Docear’s PDF Inspector extracts the full-text of the first page of a PDF and looks for the largest text in the upper third of that page. This text is returned as the title. Of course, this does not always deliver the correct title (e.g. sometimes the journal name is formatted in a larger font size than an article’s title), but in about 70% of cases you will get the correct one. The main features of Docear’s PDF Inspector are:
- Extracts titles from PDF files with good accuracy (~70%) and excellent run-time (few milliseconds per PDF in batch mode)
- Usable as a library (other tools can easily integrate Docear’s PDF Inspector to extract titles from PDFs)
- Usable as a stand-alone command-line application (returns a PDF’s title on the command line)
- Usable in batch mode (stores the extracted titles into a CSV file)
- Reads all PDF versions (other tools such as SciPlore Xtract or ParsCit use PDFBox for processing the PDFs. However, PDFBox sometimes has problems extracting text from PDFs that are not 100% compliant with the PDF standard. Docear’s PDF Inspector is more tolerant)
- Written entirely in Java 1.6. Hence, Docear’s PDF Inspector runs on any major operating system, including Windows, Linux, and Mac OS, without any other tools required (besides the Java runtime environment, of course)
- Completely independent of further tools – you only need Docear’s PDF Inspector, that’s it (e.g. SciPlore Xtract requires pdftohtml to be installed)
- Released under the GNU General Public License (GPL) 2 or later, which means it is completely free to use and its source code can be downloaded and modified by anyone.
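The title heuristic described above (largest text in the upper third of the first page) can be sketched roughly as follows. This is not the code of Docear’s PDF Inspector, which is written in Java; it is a minimal illustration that assumes the text fragments of the first page have already been extracted together with their font size and vertical position. The `TextSpan` type, the `guess_title` function, and the choice to join several lines of the same font size are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class TextSpan:
    """Hypothetical text fragment taken from the first page of a PDF."""
    text: str
    font_size: float
    y_from_top: float  # distance from the top edge of the page


def guess_title(spans, page_height):
    """Return the text set in the largest font within the upper third of the page."""
    upper_third = [s for s in spans if s.y_from_top <= page_height / 3]
    if not upper_third:
        return None
    largest = max(s.font_size for s in upper_third)
    # Assumption: fragments sharing the largest size belong to one title,
    # so they are joined from top to bottom.
    title_parts = sorted(
        (s for s in upper_third if s.font_size == largest),
        key=lambda s: s.y_from_top,
    )
    return " ".join(s.text.strip() for s in title_parts)


# Made-up sample page; only the spans in the upper third are considered.
spans = [
    TextSpan("Journal of Examples, Vol. 1", 9.0, 30.0),
    TextSpan("A Sample Paper Title", 18.0, 90.0),
    TextSpan("Spanning Two Lines", 18.0, 112.0),
    TextSpan("Jane Doe", 11.0, 140.0),
    TextSpan("1 Introduction", 14.0, 400.0),
]
print(guess_title(spans, page_height=792.0))
```

The sketch also shows where the heuristic fails: if the journal name in the sample were set in a larger font than the title, it would be returned instead.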
Beta 12 has many new features and improvements
- New: Incoming PDFs are now displayed in a separate window
- New: ‘Import All’ and ‘Import New’ Bookmarks
- Improved: Update of the monitoring node is now MUCH, MUCH faster
- Improved: Better understandable error messages when the web service is not available (for mind map backup, user validation etc.)
- Improved: Logging events are sent up to three times if connection breaks
- Improved: Better exception handling if no internet connection exists
- Improved: Icons are now in higher resolution (more…)
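The “sent up to three times” behavior for logging events can be pictured with a small retry loop. This is only an illustrative sketch and not Docear’s actual code; `send_with_retries` and `flaky_send` are hypothetical names, and the assumption is that a broken connection shows up as a `ConnectionError`.

```python
def send_with_retries(send, event, max_attempts=3):
    """Try to deliver `event` via `send`, giving up after `max_attempts` tries.

    `send` is any callable that raises ConnectionError when the connection
    breaks. Returns the number of attempts used, or re-raises the last error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(event)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise


# Simulated transport that fails twice before it succeeds.
calls = []


def flaky_send(event):
    calls.append(event)
    if len(calls) < 3:
        raise ConnectionError("connection broke")


attempts = send_with_retries(flaky_send, {"type": "log", "message": "map opened"})
print(attempts)
```

A real client would probably wait a little between attempts; the sketch leaves that out to keep the idea visible.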