These days, the code that drives a vast number of applications, ranging from software on our laptops to apps on our smartphones, is written in the programming language Java. A basic installation of Java, the so-called Java Runtime Environment (JRE), is usually preinstalled on any device that needs it to run such software. Consequently, most people never have to think about whether they “have Java” on their machine for everyday software usage.

Java has also come to play a big role in scientific computing, and many software tools commonly used in corpus and computational linguistics (CCL) are written in Java. Many of them, especially those that come prepackaged with a graphical user interface and have all the appearance of self-contained software, can make do with a Java Runtime Environment that is already installed on many machines. However, many other tools in the field have more advanced requirements: they need a Java Development Kit (JDK), that is, a Java installation that includes the compiler and other development tools in addition to the runtime. Anyone who has taken a Java programming course will remember having to install a JDK in order to do any software development themselves. The same applies if we want to employ in our research workflow any of the many tools that are commonly used in CCL, for example the Stanford NLP software such as the Stanford PoS Tagger or the Stanford CoreNLP tools.
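To make the difference between a JRE and a JDK more tangible, here is a minimal sketch of a check that reports whether the Java installation it runs on exposes a compiler. The class name JdkCheck and its two helper methods are made up for this illustration; the only library call involved is ToolProvider.getSystemJavaCompiler() from the standard javax.tools package, which returns null when no compiler is available. The sketch assumes that a missing compiler indicates a JRE-only installation, which is the typical case but not the only possible one.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper class for illustration only.
public class JdkCheck {

    // Returns true if the running Java installation exposes a compiler.
    // This is normally the case for a JDK and not for a plain JRE.
    public static boolean hasCompiler() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    // Turns the result of the check into a short human-readable message.
    public static String describe(boolean compilerAvailable) {
        return compilerAvailable
                ? "JDK detected: a Java compiler is available."
                : "No compiler found: this looks like a JRE-only installation.";
    }

    public static void main(String[] args) {
        // Standard system properties identifying the running Java installation.
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Java vendor:  " + System.getProperty("java.vendor"));
        System.out.println(describe(hasCompiler()));
    }
}
```

Since compiling the class itself requires a JDK, the check is mainly useful when the compiled class is run on another machine, or to confirm which of several installed Java versions is actually picked up by the java command.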

This section therefore provides some suggestions for installing a JDK as a basis for your linguistic processing environment. This used to be relatively straightforward for as long as a stock JDK installer package was made freely available by Oracle or, previously, Sun. As of 2019, however, Oracle have changed their licensing terms and conditions, which means that, depending on how it is used, the Oracle JDK may now require a paid license. The enterprise licenses offered by Oracle as of 2019 cost money and need to be upgraded frequently. Students and researchers are not always able to afford a license and are frequently in no position to constantly monitor whether they are running their software under the correct license. For them, Oracle and others make available an open source version of the Java Development Kit called the Open JDK.

Java Open JDK

The Open JDK is made available by a number of parties. Oracle itself offers a version that does not come with very convenient installation support. Fortunately, other parties have stepped in and offer builds of the Open JDK that ship with a more convenient installer, many of which are tested for full compliance with the original Oracle JDK. Some examples of sources for downloading the Open JDK are listed here. Note that this is by no means a complete list and is in no way binding for students of my courses; it merely contains some tested sources suitable for single-user environments such as our desktop machines:

There are many other possibilities. The HRZ at JLU Giessen has a very useful overview of what has changed and describes alternative Java solutions suitable for students and researchers.