Character encoding

Character encoding refers to the way computational systems represent the individual characters of an alphabet, numerals, or any other grapheme-like symbol (punctuation marks, mathematical operators, etc.) as numeric values and, ultimately, as bytes. Character encoding is a notorious issue in natural language processing, because the choice of encoding determines, among other things, how natural language data is represented in the memory of a computational system and therefore how it is processed. Ultimately, it also determines how the data is displayed on screen, in print, and so on.
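
To illustrate how the choice of encoding changes the in-memory representation, here is a minimal Python sketch. The sample word and the helper name hex_bytes are assumptions made for this illustration, and UTF-8 and Latin-1 (ISO 8859-1) are picked only as two widely used encodings, not because the text above singles them out.

```python
# Sample word containing one non-ASCII character (U+00EF, "i" with diaeresis).
# Written with an escape so the source file's own encoding does not matter.
text = "na\u00efve"


def hex_bytes(data: bytes) -> str:
    """Hypothetical helper: render a byte string as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


# The same five characters become different byte sequences per encoding.
utf8_bytes = text.encode("utf-8")      # U+00EF takes two bytes in UTF-8
latin1_bytes = text.encode("latin-1")  # U+00EF takes one byte in Latin-1

print(hex_bytes(utf8_bytes))    # 6e 61 c3 af 76 65
print(hex_bytes(latin1_bytes))  # 6e 61 ef 76 65

# Reading bytes with the wrong encoding does not fail here; it silently
# yields different characters, which then propagate to the display.
misread = utf8_bytes.decode("latin-1")
print(len(text), len(misread))  # 5 6
```

The last step shows why the encoding matters for display as well as storage: the six UTF-8 bytes, interpreted as Latin-1, decode to six characters instead of the original five, so the word appears garbled on screen even though no byte has changed.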