We offer a paid internship for Bachelor students from Greece, Italy, Portugal, and Spain

We are glad to announce that we will, again, offer a paid internship in cooperation with the German Academic Exchange Service (DAAD). If you are an undergraduate student, interested in software engineering or statistics, and coming from Greece, Italy, Portugal, or Spain, get yourself started and do an 8-12 week internship in summer or autumn 2013, fully paid. And if you are not eligible to apply for the internship – please tell your friends to apply! 🙂

Your project

The research question you will answer is “How to provide (better) research paper recommendations to our users?”. As such, it will be your task to support the Docear team in researching how the interests of Docear’s users can be identified from the users’ mind maps and how these interests can be matched with interesting items to recommend. You will do literature research, create new ideas, analyze user data, and implement new recommendation approaches in Java. Of course, you don’t have to do all of this alone – you will be closely cooperating with the Docear team. Your work will be integrated into Docear and used by thousands of researchers around the world. If your work is outstanding, we will write a research paper with you.

Requirements

You should have a profound knowledge of the programming language Java. Knowledge of statistics, machine learning, other programming languages (especially C/C++ or Python) and/or MySQL, neo4j, Hibernate, Jersey, REST Web Services, Tomcat, and Apache is a plus, but not a requirement. Of course, we would appreciate it if you spoke German, but it is no problem if you only speak English. We would prefer that you apply for a long internship (12 weeks), but you can also apply for a shorter one. If you are interested in combining your internship with writing a Bachelor thesis, please let us know in advance (this would be highly welcome). You can start at any date you want in summer or autumn 2013.

Important: If you don’t want to program but have profound knowledge of statistics, you are also very welcome to apply. In this case you will support us in evaluating how good our current recommender system is, and you will help us generate ideas for improvements. Please indicate clearly in your application that you are not interested in software development but in statistics.

(more…)

Docear Beta 9 with several bug fixes and feature enhancements

The 9th Beta of Docear is available for free download. It contains no new features but several bug fixes and feature improvements. One improvement is the removal of line breaks in imported annotations. So far, when you highlighted text over several lines in a PDF, Docear imported the line breaks of the highlighted text, which sometimes caused a not so nice layout. In the new Docear you can right-click on a node that links an annotation and select “PDF->Remove lines breaks from annotation”. Bug fixes include a fix for the bug that the Adobe Acrobat Professional PDF Viewer wasn’t recognized under MacOS. On Linux, the splash screen no longer hides the setup screen on the very first start of Docear. For a detailed list of all changes see the following change log. And keep in mind – we are always looking forward to your feedback, and don’t forget to have a look at the new preview of Docear’s Online Viewer.

Feature Enhancements:

  • #678 Adobe Acrobat Professional is included in PDF Viewer recognition on MacOS
  • #782 Function to remove line breaks in annotations
  • #784 More default file types to be imported
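
To illustrate what removing line breaks from an annotation means, here is a minimal Python sketch. It is not Docear's actual implementation (Docear itself is written in Java); the function name `remove_line_breaks` is made up for this example, and it simply assumes that every line break in the highlighted text should become a single space.

```python
import re


def remove_line_breaks(annotation: str) -> str:
    """Join a multi-line highlighted passage into a single line.

    Hypothetical helper for illustration only: each line break,
    together with any blanks or tabs around it, becomes one space.
    """
    joined = re.sub(r"[ \t]*\r?\n[ \t]*", " ", annotation)
    return joined.strip()


if __name__ == "__main__":
    highlighted = "research paper\nrecommender\r\nsystems"
    print(remove_line_breaks(highlighted))
```

A real implementation would probably also have to decide what to do with words that were hyphenated at the end of a line; the sketch leaves them untouched.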

(more…)

Preview of the Docear Online Viewer

A few months ago we announced that we would develop an add-on allowing researchers to collaborate on the same data. Well, we haven’t finished this completely yet but the first step is done. In the past few months we developed an online viewer that allows you to view your backed-up mind maps in your web browser. Right now the viewer is only capable of displaying very small mind maps, but it’s our highest priority to improve the performance so you can also view larger mind maps. In addition, the online viewer displays only the basic elements such as nodes and edges. Other features such as attributes are not yet displayed but, again, we are working on it :-). Of course, this is just the beginning. As a next step Docear Online will allow you to edit your mind maps online. The following step will enable you to work simultaneously with different researchers on the same data, either in your browser or with your desktop version of Docear.

docear online preview

To have a look at the preview version of Docear Online, go to https://my.docear.org and log in with your Docear user name and password. You will be able to view all mind maps that have been created and backed up with the desktop version of Docear. Simply select your mind map in the menu in the upper-right corner. To activate the backup function, start Docear Desktop and open Tools->Preferences->Online Services->Manage Docear Service Settings.

(more…)

Docear4Word 1.01 with bug-fixes and some enhancements

Today we released Docear4Word 1.01. The add-on for Microsoft Word allows you to manage your Docear references (and any other references stored as BibTeX) directly within Microsoft Word. Version 1.01 includes the following changes. Download it here!

  • Added warning message if a BibTex file is considered corrupt, rather than just ignoring it. (#740, #692)
  • BibTex load now also supports CP1252 codepage.
  • Unexpected exceptions now logged to the log file. (#740)
  • Added workaround for missing BibTex keys. We create a new one of the form “_Unknown_XX” where XX increases with each missing key within the file.
  • DEV: Added MLA.csl sample file.
  • Fixed bug where the ID was being used instead of the Name.
  • Made toolbar dropdown wider.
  • Added warning message and instruction when no BibTex database is configured. (#740)
  • Parser now copes with no tags present.
  • Removed paragraph formatting from within Field code as it influenced formatting in the main document.
  • JSON is now stored within the Field with space separators and LineFeeds since it could influence formatting in the main document. (#729)
  • Updated Citeproc.js to v1.0.426 which fixed these issues:
    • Incorrect trimming of punctuation. (#743)
    • “Tri-graph” styles not working. (#694)
    • Failure to load some styles containing comments.
  • BibTex Lexer now supports unix line endings. (#692)
  • Issue tag is now supported (#743)
  • Fixed bug with Issue and Number casing.
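
The workaround for missing BibTeX keys can be pictured roughly like this. The following Python sketch is only an illustration and not the code used in Docear4Word; the function name `fill_missing_keys`, the counter starting at 1, and the lack of zero-padding are assumptions, since the change log only says that XX increases with each missing key within the file.

```python
from typing import List, Optional


def fill_missing_keys(keys: List[Optional[str]]) -> List[str]:
    """Replace missing or empty keys with "_Unknown_XX" placeholders.

    Hypothetical sketch: the numbering scheme (1, 2, 3, ...) is an
    assumption, not taken from Docear4Word.
    """
    result = []
    counter = 0
    for key in keys:
        if key is None or not key.strip():
            counter += 1  # one new number per missing key in this file
            result.append(f"_Unknown_{counter}")
        else:
            result.append(key)
    return result


if __name__ == "__main__":
    # The entry keys below are invented for the example.
    print(fill_missing_keys(["smith2010", None, "", "jones2012"]))
```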

(more…)

Mendeley to be sold for $100M to Elsevier?

As a Docear user you probably did some research before you decided to use Docear, and maybe you stumbled upon the reference manager Mendeley. Mendeley definitely has some nice features and has become one of the top reference management tools in the past few years (besides the fact that they don’t use mind maps for literature management, the main reason I wouldn’t use Mendeley is the fact that they store the annotations you make in PDFs in a proprietary format, which locks you in to Mendeley and makes it really hard or impossible to switch to another tool). Two days ago TechCrunch reported that the well-known publisher Elsevier is interested in buying Mendeley for presumably 100,000,000 US$. That’s right: 100 Million US$. Considering that Mendeley is supposed to have 2 Million users, that would be 50$ per user (and I don’t know if the 2 Million users are really active users). As far as I remember, the shareholders of Facebook paid about 100 Dollars per user when Facebook shares were first available on the stock market. Not bad :-).

What do you think? Is Mendeley worth 100 Million Dollars? Is it a smart move by Elsevier to buy Mendeley? And what are the consequences for Mendeley’s users, since Elsevier is known for a very harsh publishing policy (which led to a boycott of Elsevier and lots of criticism by many academics)?

(more…)

WYSIWYG citation style editor for Docear4Word’s citation styles

Docear4Word is an add-on that allows you to manage your Docear references in Microsoft Word. It uses the Citation Style Language (CSL), an open XML-based language to describe the formatting of citations and bibliographies. Not only Docear uses CSL but also other reference managers such as Zotero (who initiated the development of CSL) and Mendeley, and they are all contributing their styles – this is why there are more than 2,000 citation styles you can use with Docear4Word and the other reference managers. However, sometimes the citation style you need is not in the citation style repository, and up to now it was quite challenging to create a new style (or edit an existing one).

During the past months, the Columbia University Libraries, Alfred P. Sloan Foundation, and Mendeley developed a WYSIWYG (What You See Is What You Get) editor for citation styles. This editor makes it easier than ever to edit existing styles and create new ones. So, if you are missing a citation style for a journal or conference you are submitting a paper to, well… create it and send a big thank you to the three organizations making this possible! :-).

(more…)

Bachelor students: Do a paid internship in software engineering or statistics here at Docear

We are glad to announce that we will, again, offer a paid internship in cooperation with the German Academic Exchange Service (DAAD). If you are an undergraduate student, interested in software engineering or statistics, and coming from the US, UK or Canada, get yourself started and do an 8-12 week internship in summer or autumn 2013, fully paid. And if you are not eligible to apply for the internship – please tell your friends to apply! 🙂

Your project

The research question you will answer is “How to provide (better) research paper recommendations to our users?”. As such, it will be your task to support the Docear team in researching how the interests of Docear’s users can be identified from the users’ mind maps and how these interests can be matched with interesting items to recommend. You will do literature research, create new ideas, analyze user data, and implement new recommendation approaches in Java. Of course, you don’t have to do all of this alone – you will be closely cooperating with the Docear team. Your work will be integrated into Docear and used by thousands of researchers around the world. If your work is outstanding, we will write a research paper with you.

Requirements

(more…)

The second Freeplane/Docear Developer Conference: User Friendliness, Collaboration and Scripting

Last year in July we met the core developers of the mind mapping software Freeplane in Munich (Docear’s mind mapping component is based on Freeplane). At that meeting we decided that Freeplane and Docear would closely cooperate. Now, more than a year later, it was time to meet again and discuss the next big steps in the development of Freeplane and Docear. So we met this weekend in Magdeburg, Docear’s “headquarters”. And while last year six people attended the meeting, this year we were nine participants (three from Docear, three from Freeplane and three students from HTW Berlin) plus one Freeplane developer attending via video. In the following I would like to provide a brief overview of the topics discussed and the results.

The Freeplane and Docear team in Docear’s office in Magdeburg (Germany)

(more…)

We need your help (i.e. a server) to build a repository for academic PDF files

It has been a while since we started crawling the Web for academic PDFs to index them and use them for Docear’s research paper recommender system. Meanwhile, we have collected quite a few PDFs. Unfortunately, in the foreseeable future, our servers’ disks will be full, and the load on our servers is already too high (that’s why you sometimes won’t get recommendations in Docear – our servers simply are too busy).

Since our budget is tight and we don’t want to spend too much time on server administration either, we are asking for your help: Do you have a server that you could spare? What we need is the following:

(more…)

Docear partners with HTW Berlin and welcomes five new student-developers

There is amazing news – Docear cooperates with the HTW Berlin (Berlin’s university of applied sciences for technology and economy). We will supervise the Master’s projects of five students (Alexander, Florian, Julius, Michael, and Paul). Unlike at most other universities, the students’ goal is not to do some theoretical work but to gain some real-world development experience by joining Docear’s development team. That means we roughly double our development power, and we are not talking about an internship of a few weeks. We are talking about the next eight months working almost half-time, so there should really be some noteworthy results. We still have to discuss what exactly “The Five” will be doing, but since all of them prefer web development and design, it will definitely be something web-based. Right now we are considering a simple web version of Docear and a synchronization add-on to sync your files between different computers and the Web. Ideally, the add-on will additionally allow you to work collaboratively on the same data with other Docear users.

(more…)

List of 6513 stop-words for 17 languages (English, German, French, Italian, and many others)

To optimize Docear’s research paper recommender system I was looking for an extensive stop word list – a list of words that are ignored in the analysis of your mind maps and research papers (for instance ‘the’, ‘and’, ‘or’, …). It’s easy to find some lists for some languages but I couldn’t find one extensive list for several languages. So I created one based on the stop lists from

  • http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html
  • http://jmlr.csail.mit.edu/papers/volume5/lewis04a/a11-smart-stop-list/english.stop
  • http://members.unine.ch/jacques.savoy/clef/
  • http://norm.al/2009/04/14/list-of-english-stop-words/
  • http://snowball.tartarus.org/algorithms/english/stop.txt
  • http://solariz.de/649/deutsche-stopwords.htm
  • http://www.lextek.com/manuals/onix/
  • http://www.ranks.nl/resources/stopwords.html
  • http://www.textfixer.com/resources/common-english-words.php
  • http://www.translatum.gr/forum/index.php?topic=2476.0

In case anyone else needs such a stop word list: Here it is, 6513 stop words for English, French, German, Catalan, Czech, Danish, Dutch, Finnish, Norwegian, Polish, Portuguese, Romanian, Spanish, Swedish, and Turkish. I believe that some words have an encoding problem. If you discover an error, please let me know and I will correct it. Also, I wouldn’t be surprised to learn that a stop word from one language is an important word in another language. If you discover some words in the list that should not be ignored by our research paper recommender system… please let us know 🙂
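
For those who want to build or use such a list themselves, here is a minimal Python sketch of the idea: merge several stop lists into one set and drop those words from a list of tokens. It is not the script used to create the list above; the function names and the choice to lowercase and trim every word are assumptions made for this example.

```python
from typing import Iterable, List, Set


def merge_stop_lists(lists: Iterable[Iterable[str]]) -> Set[str]:
    """Merge several stop lists into one de-duplicated set.

    Assumption for this sketch: words are trimmed and lowercased,
    and empty entries are skipped.
    """
    merged: Set[str] = set()
    for words in lists:
        for word in words:
            cleaned = word.strip().lower()
            if cleaned:
                merged.add(cleaned)
    return merged


def remove_stop_words(tokens: List[str], stop_words: Set[str]) -> List[str]:
    """Keep only the tokens that are not in the stop word set."""
    return [t for t in tokens if t.lower() not in stop_words]


if __name__ == "__main__":
    english = ["the", "and", "or"]
    german = ["der", "und", "oder"]
    stop_words = merge_stop_lists([english, german])
    print(remove_stop_words(["The", "recommender", "and", "der", "system"], stop_words))
```

Because all languages end up in one set, the sketch also shows the problem mentioned above: a stop word from one language is removed even when it is a meaningful word in another.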

(more…)

Evaluations in Information Retrieval: Click Through Rate (CTR) vs. Mean Absolute Error (MAE) vs. (Root) Mean Squared Error (MSE / RMSE) vs. Precision

As you may know, Docear offers literature recommendations and, as you may know further, it’s part of my PhD to find out how to make these recommendations as good as possible. To accomplish this I need to know what a ‘good’ recommendation is. So far we have been using Click Through Rates (CTR) to evaluate different recommendation algorithms. CTR is a common performance measure in online advertising. For instance, if a recommendation is shown 1000 times and clicked 12 times, then the CTR is 1.2% (12/1000). That means if algorithm A has a CTR of 1% and algorithm B has a CTR of 2%, B is better.

Recently, we submitted a paper to a conference. The paper summarized the results of some evaluations we did with different recommendation algorithms. The paper was rejected. Among other things, a reviewer criticized the CTR as too simple an evaluation metric. We should rather use metrics that are common in information retrieval such as Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), or Precision (i.e. Mean Average Precision, MAP).

The funny thing is, CTR, MAE, MSE, RMSE and Precision are basically all the same, at least in a binary classification problem (recommendation relevant / clicked vs. recommendation irrelevant / not clicked). The table shows an example. Assume you show ten recommendations to users (Rec1…Rec10). Then the ‘Estimate’ for each recommendation is ‘1’, i.e. it is predicted to be clicked by a user. The ‘Actual’ value describes whether a user actually clicked on a recommendation (‘1’) or not (‘0’). The ‘Error’ is either 0 (if the recommendation actually was clicked) or 1 (if it was not clicked). The mean absolute error (MAE) is simply the sum of all errors (6 in the example) divided by the total number of recommendations (10 in the example). Since we have only zeros and ones, it makes no difference whether they are squared or not. Consequently, the mean squared error (MSE) is identical to MAE. In addition, precision and mean average precision (MAP) are identical to CTR; precision (and CTR) is exactly 1-MAE (or 1-MSE), and RMSE also perfectly correlates with the other values because it’s simply the square root of MSE (or MAE).

Click Through Rate (CTR) vs. Mean Absolute Error (MAE) vs Mean Squared Error (MSE) vs Root Mean Squared Error (RMSE) vs Precision

In a binary evaluation (relevant / not relevant) in information retrieval, there is no difference in the significance between Click Through Rate (CTR), Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Precision.
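
The example can also be checked with a few lines of code. The following Python sketch is only an illustration: it assumes ten recommendations with an estimate of 1 each, of which four were clicked (so the errors sum to 6, as in the example); which four were clicked is made up, and the function name `binary_metrics` is hypothetical.

```python
import math
from typing import Dict, List


def binary_metrics(actual: List[int]) -> Dict[str, float]:
    """Compute CTR, precision, MAE, MSE and RMSE for binary click data.

    Every recommendation is assumed to have the estimate 1 ("will be
    clicked"); `actual` holds 1 for a click and 0 for no click.
    """
    estimate = [1] * len(actual)
    errors = [abs(e - a) for e, a in zip(estimate, actual)]
    n = len(actual)
    mae = sum(errors) / n
    mse = sum(err ** 2 for err in errors) / n  # same as MAE for 0/1 errors
    hits = sum(1 for e, a in zip(estimate, actual) if e == 1 and a == 1)
    return {
        "ctr": sum(actual) / n,
        "precision": hits / sum(estimate),
        "mae": mae,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


if __name__ == "__main__":
    # Hypothetical click pattern: 4 of 10 recommendations were clicked.
    clicks = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
    for name, value in binary_metrics(clicks).items():
        print(f"{name}: {value:.4f}")
```

With this data, CTR and precision come out the same, MAE and MSE come out the same, and CTR equals 1-MAE, which is exactly the point made above.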

(more…)

Don’t be shy: Write a testimonial about Docear

Today I gave a presentation about Docear. Among other things, I wanted to show how positively many researchers respond to the concept of Docear. So I assembled a little picture with quotes about Docear (or SciPlore MindMapping, respectively) (see figure below).

After the presentation one of the attendees told me, he was surprised to not find any of the quotes on our testimonial page. Well, he was absolutely right. The most enthusiastic feedback we receive is by email. I have no clue why, but users seem to be a little bit shy when posting their opinion publicly.

For us, it’s really important to know what users think, and their feedback – your feedback – may really help us, for instance, to convince our university to provide us with further resources (at least if your feedback is positive ;-). So, I would like to ask you: If you are a Docear user, write a short testimonial. Don’t send it by email, don’t post it here as a comment – leave it on our testimonial page, and please be a little bit enthusiastic :-). Surely, we don’t want you to write something you do not really think. But please also don’t hold back. And of course, we are also happy to hear constructive criticism. If you have ideas on improving Docear, or if there is something you don’t like, let us know in our forum and we will happily discuss the issue with you.

(more…)

Docear4Word 1.0: Managing citations, bibliographies and references in Microsoft-Word based on BibTeX

For students and researchers, reference management is probably the most tiring task in their daily work routine. They have to re-type and format bibliographic information again and again, for each paper, assignment or thesis. This is particularly annoying if you need to change citation styles. As a student this might happen because your supervisor changes his mind on his favorite citation style, and researchers constantly need to adjust citation styles because almost every journal and conference has its own requirements (see picture). Some reference management tools, such as Endnote, offer add-ons for Microsoft Word for inserting and formatting references directly within a Word document. However, users of reference management software relying on the BibTeX standard had no such add-ons (there is only BibTeX4Word, which is a good tool but very difficult to use). Until now.

Today, just after we released the brand new Docear 1.0 Beta 6,  we released Docear4Word 1.0. Docear4Word is an add-on for Microsoft Word (2003 and later) that allows you to insert references and bibliographies from BibTeX files to MS-Word documents. The great thing is that you don’t need to care about formatting. You can choose from 1,700+ citation styles (APA, MLA, Turabian, Harvard, IEEE, ACM, …).

IEEE Citation style vs. Elsevier’s Harvard Style

(more…)

Docear Beta 6 (Usability improvements)

Docear 1.0 Beta 6 is available for download. There are quite a few improvements and new features. The major change is a small icon to refresh the incoming mind map. Instead of selecting the entry in the menu, you can just click on the icon to get a new list of all your PDFs. In addition, when you create a new workspace (or start Docear for the very first time), the mind maps in the Library (Incoming, Literature & Annotations, …) will have a brief description for how to use them. We hope these two new features make Docear easier to use. Furthermore, Adobe Reader X and Professional are now correctly recognized by Docear and LaTeX users will be happy to hear that they can copy reference keys with a \cite{} prefix.

The complete changes are as follows

(more…)

Google Adwords Search vs. Google Adwords Display Network vs. LinkedIn Ads

This post has nothing to do with Docear, but if you are interested in online marketing, it might be of interest to you. A few days ago, LinkedIn sent me a 50$ voucher for their new “LinkedIn Ads” program. LinkedIn Ads is similar to Google Adwords and allows organizations (such as Docear) to advertise on the profile pages of LinkedIn members (see screenshot).

I was curious how effective LinkedIn Ads would be and started a campaign. In addition, I started a campaign with Google Adwords (see screenshot below), which is the advertisement program of Google. Both campaigns were rather similar and had similar ads. However, the results differed considerably.

(more…)

Docear4Word 1.0 RC1 available for registered users

Docear4Word is ready for release, but before we make it available for public download, we would like to publish another test version for our users. We have fixed several bugs and added the feature to select a BibTeX file. By default, Docear4Word uses the BibTeX file that is specified in Docear, but you can specify any other BibTeX file you want. We also enhanced the field mapping but we cannot guarantee that everything is perfect. If you are missing some information in your bibliography (that is available in the BibTeX file), please let us know. But be aware that the reason might be the citation style – not all citation styles show all fields. So, before you report some missing information, please test several citation styles, and if none of them shows a certain piece of information – well, then it’s probably a bug in Docear4Word.

(more…)

Docear 1.0 Beta 5 with Zotero and much better PDF Reader support

Yesterday we released Beta 5 of Docear with two major improvements. First, Docear fully supports Zotero. That means as a Zotero user you can use Zotero as you are used to and work with the same PDF files and references in Docear. For instructions on setting up Docear for Zotero read our manual. Second, we strongly improved the support for many PDF readers. Annotations from RepliGo (http://www.cerience.com/products/reader) may be imported by Docear, and PDF XChange Viewer (PDFXV) is now fully supported: Docear will automatically adjust the settings of PDFXV so that all kinds of annotations (highlighted text, comments, bookmarks) may be imported. In addition, on the first start of Docear, or when you install a new PDF reader, Docear lets you select your preferred reader and, if the reader supports this, automatically sets the right settings for the “jump-to-page” feature (i.e. the PDF will be opened on the page of an annotation).

The screenshot shows the PDF-Reader selection dialog. It should list the following PDF readers (if installed on your system): Foxit Reader, Adobe Reader, PDF XChange Viewer, Skim, Preview. Adobe Acrobat Professional will probably be supported in the next Beta. If not all installed readers are displayed on your system, please let us know.

So, which PDF reader is the most recommendable?

(more…)

Docear moves to a new office and welcomes an intern from Cambridge

It’s already been a few weeks since we moved to a new office. The computer science department of our university (OvGU) sponsors it and it’s really great, and huge. It has space for six or even seven workspaces and there is a second room for another two or three people (again a big ‘thank you’ to our university and our mentor Prof. Andreas Nürnberger). As you may know, the core team of Docear consists of three people (Stefan, Marcel and me), and then there are a few students, volunteers and of course Bela, who co-founded SciPlore MindMapping, but who is in Berkeley now. That means we still have some workspace to fill, and we have already begun to fill it with a new intern :-). Two days ago, Cheng arrived from the University of Cambridge, where he studies physics. He will stay with us for two months and will support the software development of Docear. This is particularly good news for MacOS users because Cheng has a Mac and hence we can test Docear on a real Mac for the first time.

(Part of) the Docear team (Marcel, Stefan, Joeran, Cheng) in the new office

Btw. if you are interested in an internship, or writing your thesis about Docear, please read here and contact us. We have lots of great projects to work on, there are many research fields you can do research in, and, as you know, there is enough space in our office :-).

(more…)