System Requirements

This page lists software and tools that are not specific to corpus and computational linguistics but that are prerequisites for many applications in this field. Examples include programming languages such as Java, Perl, Python and R, as well as operating system extensions required by some applications, such as Tcl/Tk and GraphViz.


Java JDK for different OSs

Make sure you are downloading the correct Java version for your operating system. For example, there are versions for both 32-bit and 64-bit systems.
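If a Python interpreter is already installed, a short script can help with this decision. The following is a minimal sketch, not an official procedure: the function name `describe_platform` is made up for illustration, and the bit width it reports is that of the running Python interpreter, which usually but not always matches the operating system.

```python
import platform
import struct


def describe_platform():
    """Collect basic facts that help when choosing an installer."""
    return {
        # Operating system family, e.g. "Windows", "Linux" or "Darwin"
        "system": platform.system(),
        # Hardware name as reported by the operating system
        "machine": platform.machine(),
        # Pointer size of this Python interpreter, in bits
        "interpreter_bits": struct.calcsize("P") * 8,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_platform().items()):
        print("%s: %s" % (key, value))
```

A 32-bit interpreter can run on a 64-bit operating system, so treat the result as a hint and confirm it in your system settings if in doubt.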


Python has become a very popular programming language among linguists, in particular corpus and computational linguists, in recent years. This success is partly due to the fact that it is available free of charge, but a crucial factor has also been the added utility for natural language processing brought to Python by Bird et al.'s Natural Language Toolkit (NLTK).

The Python programming language

Python 2.7 (the Python version that is compatible with the Natural Language Toolkit, NLTK)

Natural Language Toolkit (NLTK)
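After installing both, it can be useful to confirm which Python version is running and whether NLTK can be imported by it. The sketch below is one possible check; the helper name `installed_version` is hypothetical, and it relies only on the convention that many packages (NLTK among them) expose a `__version__` attribute.

```python
import importlib
import sys


def installed_version(module_name):
    """Return a module's version string, "unknown" if the module has no
    __version__ attribute, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    print("Python %d.%d.%d" % sys.version_info[:3])
    nltk_version = installed_version("nltk")
    if nltk_version is None:
        print("NLTK is not installed for this interpreter")
    else:
        print("NLTK version: %s" % nltk_version)
```

If several Python installations exist on the same machine, run the script with each of them, since a package installed for one interpreter is not visible to the others.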


The Perl programming language is available from several sources and is free of charge.

Perl for Windows:

Perl for all other operating systems


R is an extremely powerful statistics package and scripting language that is widely used in corpus and computational linguistics.

The R Project for Statistical Computing

Download R

Editors and environments for R programming

RStudio, a powerful interface for using R

TinnR (Windows editor for R)

StatET (Eclipse plugin for R)

For more experienced users: Emacs, Notepad++, UltraEdit or any other programming editor that supports syntax highlighting


Sometimes software packages require support for the development and deployment of graphical user interfaces by means of Tcl/Tk (pronounced tickle tee kay). Typically, the installation instructions or other specifications will tell you whether you need to install Tcl/Tk. If you do need it, you can download it from here:
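Many Python distributions already ship with Tcl/Tk bindings, so it may be worth checking before installing anything. The following sketch assumes a Python interpreter is present; the function name `tk_version` is made up for illustration. It only tests whether the bindings can be imported and does not open a window.

```python
def tk_version():
    """Return the Tk version known to Python's bindings as a float,
    or None if the bindings are not available."""
    try:
        import tkinter  # module name in Python 3
    except ImportError:
        try:
            import Tkinter as tkinter  # module name in Python 2
        except ImportError:
            return None
    return tkinter.TkVersion


if __name__ == "__main__":
    version = tk_version()
    if version is None:
        print("Tcl/Tk bindings not found for this interpreter")
    else:
        print("Tk version: %s" % version)
```

A result of `None` means only that this particular Python cannot use Tcl/Tk. Other software may still find a system-wide Tcl/Tk installation, or may need one installed separately.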



Another package that software sometimes needs for graphical display, in particular for drawing graphs and diagrams, is GraphViz. As with Tcl/Tk, the installation instructions will usually tell you whether you need it.
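Software that depends on GraphViz typically calls its `dot` command-line program, so a quick check is whether `dot` can be found on the search path. This is a sketch under assumptions: it requires Python 3.7 or later, the function names are hypothetical, and a program that locates GraphViz by other means (for example a configured install directory) is not covered.

```python
import shutil
import subprocess


def find_dot():
    """Return the full path of the `dot` executable, or None if it is
    not on the search path."""
    return shutil.which("dot")


def dot_version(dot_path):
    """Run `dot -V` and return its version banner as text."""
    result = subprocess.run([dot_path, "-V"], capture_output=True, text=True)
    # The banner is normally written to stderr; fall back to stdout.
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    path = find_dot()
    if path is None:
        print("dot not found on PATH")
    else:
        print("dot found at %s" % path)
        print(dot_version(path))
```

If `dot` is installed but not found, the usual cause is that its directory has not been added to the `PATH` environment variable.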



Another open source graphics package that is very useful in corpus and computational linguistics and in other fields that need to visualize numerical data.