The Windows Shell (aka the command line)

The Windows Shell, also known as the Command Line (German: Eingabeaufforderung), provides access to commands and tools that are not accessible via the graphical user interface but have to be called with command line parameters. Many professional tools in linguistics have to be called from the command line, so it is important for students of linguistics to familiarize themselves with it. In this tutorial, we introduce the basic functionality of the command line that is necessary for navigating the file system, for creating and deleting files and directories, and for inspecting and manipulating data without a graphical user interface. In a second step, we provide some examples of useful tools that become accessible through the command line. We hope to encourage users who are chiefly used to operating a computer via the graphical user interface (GUI) to use the command line proficiently and to their advantage.

Calling the Shell from the Windows GUI

There are different ways of starting the shell from the Windows GUI. We introduce the two most important ones here to get you started:

1. Calling the Shell from the Program menu

The simplest way of calling the command line or shell is via the Program menu.

Click on the START button, click on Programs, navigate to Accessories and then click the Command Prompt icon. This opens a window with a black background and white characters. This is the shell or command line. When first opened, it shows some text including, at the very top, information on the version of Microsoft Windows you are running on your computer. The cursor blinks after a sequence of characters called the command prompt, or prompt for short. It tells you where in the file system you are currently working, the so-called working directory. On my computer, it looks like this:

C:\Users\bartsch>

2. Calling the Shell from the Search ... dialogue

Users can also call the shell from the Search programs and files … dialogue (German: Programme/Dateien durchsuchen …) in the bottom left corner of the menu that opens when they click the Start button in Windows. Type the three letters cmd (short for command) into that search box, press Enter, and the command line window described above opens.

These two ways of calling the command line from the Windows GUI are equivalent; choose whichever suits your preferences.

Users may find the second approach very convenient because it also allows them to call up other programs from the keyboard. This is especially appealing for keyboard-focused people like myself. Try it by typing the name of a program into the box and see what happens. As an example, type “Excel” and Windows will show you the Excel icon, which opens Microsoft Excel. If more than one program contains the letter sequence you typed, Windows lists all programs with that letter sequence in their name, and only these. You can verify this by typing in “Word”: very likely, Windows will offer you Microsoft Word (if installed on your machine) and WordPad, one of the standard editors that ship with Microsoft Windows.

3. Some basic commands

One of the most important functions you will need to use on the command line is navigation between directories. Here are some of the options in tabular form:

command | semantics | prose description | example
cd | change directory | changes the working directory to the specified directory | cd \temp
cd .. | change to the parent directory | moves one level up in the directory tree, to the parent of the current directory | cd ..
cd \ | change to root (c:\) | changes to the root directory of the current drive (e.g. to c:\) | cd \
md | make directory | creates a new directory under the current working directory | md temp
rd | remove directory | removes the specified directory, which must be empty | rd temp
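
The commands in the table can be tried out in a short session such as the following sketch. The directory name temp is only an example, and the session assumes that you are allowed to create a directory in your current working directory.

```bat
rem Show the current working directory (cd without arguments prints it)
cd

rem Create a new directory called temp under the working directory
md temp

rem Change into the new directory, then move back to its parent directory
cd temp
cd ..

rem Remove the (still empty) directory again
rd temp
```

Because rd only removes a directory that is empty, this sequence cannot delete any existing data by accident.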

4. Usability functions

4.1 Auto completion

A nice and often overlooked feature of the Windows command line is *auto completion*. When you type in long file or directory names, you will often make typing mistakes, and every time you make one, your command fails because the operating system cannot tell what you mean. This is one of the most frustrating experiences for novice command line users on any operating system, as well as for novice programmers. However, the developers of operating systems offer some help. When you start typing a file or directory name, you can hit the Tab key after the first few letters and the system completes the character sequence with the name of a file or directory in the current working directory. The best way to explore this function is to try it for yourself. Open the command line and you will be in the default working directory of your user. On my computer that is:

c:\Users\bartsch

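Now type in

cd Do

and hit the Tab key. Windows completes this to

cd Documents

(or cd Dokumente, if that is what the folder is called on your system). If several names start with the same letters, pressing the Tab key again cycles through them.

Hit the Return key and the system changes into that directory.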

Auto completion also works for file names.

4.2 Command recovery

Many commands are reused regularly and have to be typed in many times in one session. cd for change directory is one such example, but calls to software programs also often have to be repeated. Instead of retyping a command, you can use another convenience function of the command line called command recovery. If your command line window is still open and you have already typed in a few commands from the exercises above, hit the up arrow key (usually located in the bottom right-hand area of your keyboard, between the main alphabetical block and the number block) and see what happens. Hit it a few times and you will see that the system keeps in memory the commands that you have typed in the current session; you can recover them in reverse order, most recent first. You can even recover a previous command and alter it, for example to work on a different file. This function is especially useful when you are carrying out repetitive tasks that only require minor alterations.

5. Output handling

Most command line tools print their output to the screen by default unless they are “told” to do otherwise. This is a very useful default behaviour because it allows you to inspect instantly whether a tool is performing the desired function and whether the data look as if they have been processed as intended. However, once a piece of software is behaving in the desired fashion, we usually want to capture the output for further processing or for query and analysis. So a function that allows us to capture the output in a file instead of on the screen would be very welcome. Fortunately, operating systems offer this functionality out of the box. A typical call specifies the processing function to be carried out, optional parameters influencing the processing (e.g. the choice of a model for part-of-speech tagging or parsing), the name of the input file, a symbol telling the operating system to redirect the output to a file, and the name of the output file, which can be chosen by the user.

Under the Windows shell, the redirection symbol is > (not to be confused with the pipe symbol |, which passes the output of one command on to another command). The command, including the input file name, goes before the symbol and the output file name after it. This writes the output to a file in the current working directory unless a different path is specified.
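
As a minimal sketch of capturing output in a file, the following lines use the built-in sort command. The file names words.txt and words_sorted.txt are hypothetical placeholders for your own data.

```bat
rem Print the sorted lines of words.txt to the screen (the default behaviour)
sort words.txt

rem Write the same output to a file instead of the screen
sort words.txt > words_sorted.txt

rem Inspect the captured output
type words_sorted.txt
```

Note that > overwrites words_sorted.txt without warning if a file of that name already exists.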

Examples of running this type of procedure can be viewed in the following tutorials:

6. Shell settings

For the general user of modern GUI-based operating systems, the shell hardly ever plays a role, except when it occasionally pops up to display system messages during certain installation steps. As a linguist, however, you frequently come across tools that require command line interaction. Unfortunately, the default Windows shell settings are not geared towards the professional user, and especially not towards users working in multilingual settings. One effect of this is that, with its default settings, the shell does not display UTF-8 encoded text correctly. As a result, when you call the TreeTagger from the command line and have its output written to the screen for languages with special characters, even ones as common as the German Umlauts or ß, these do not display correctly. This is not due to a malfunction of the TreeTagger but to the interaction between the shell and UTF-8 encoded data. The issue can be avoided by writing the output directly to a file, but sometimes one just wants to inspect initial output on the screen, and for such cases you can modify the shell settings as described below.

6.1 Shell settings: Font settings

In order to achieve the desired effect, first change the standard font used by the Windows shell (aka Command Prompt) as well as by Windows PowerShell.

To ensure that all characters are represented correctly in a Windows shell, the font used to display the output must be a TrueType font (TTF) that covers the characters occurring in your data. Avoid the so-called raster fonts, especially on high-resolution screens, as they may not display clearly. A good choice that is also very legible for teaching purposes is the TrueType font Lucida Console. You may not need to change these settings on your system, but if you find that special characters are not displaying correctly, experiment with the font settings in the Windows shell.

6.2 Shell settings: Code page

If annotation tools such as the TreeTagger show faulty characters in their standard output (the screen), e.g. in Umlauts and other characters with diacritics, consider temporarily changing the code page to UTF-8, which Windows identifies as code page 65001. To do so, type the following after opening the shell and before executing the TreeTagger:

chcp 65001

If this solves your problem, consider adding this line to the beginning of the batch file driving the automatic annotation.
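
A batch file along these lines might look like the following sketch. The wrapper name tag-german and the file name input.txt are assumptions for illustration only; substitute whatever command and files your annotation setup actually uses.

```bat
@echo off
rem Switch the code page of this shell session to UTF-8
chcp 65001

rem Run the tagger and print its output to the screen.
rem tag-german and input.txt are assumed names, not taken from this tutorial.
rem "call" returns control to this batch file if the tagger is itself a batch file.
call tag-german input.txt
```

The code page change only applies to the shell window in which the batch file runs; a newly opened shell starts with the default code page again.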

… to be continued shortly …