Lemmatization

Closely related to the identification of parts of speech in a corpus is the process of lemmatization. It reduces the inflectional variants of a word to their respective lemma or lexeme. It is mostly used as a corpus annotation in work on digital lexicography or vocabulary, where words like doing, done and does are reduced to their lemma do.
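The reduction described above can be illustrated with a minimal sketch in Python. It assumes a small hand-written lookup table; the names LEMMA_TABLE, lemmatize and annotate are hypothetical and chosen only for this example. Real lemmatizers typically rely on full morphological dictionaries and part-of-speech information rather than a table of this kind.

```python
# Toy lookup table mapping inflected forms to their lemma.
# Illustrative assumption only: a real resource would cover the whole lexicon.
LEMMA_TABLE = {
    "doing": "do",
    "done": "do",
    "does": "do",
    "did": "do",
}


def lemmatize(token: str) -> str:
    """Return the lemma of a token, or the lowercased token if it is not in the table."""
    key = token.lower()
    return LEMMA_TABLE.get(key, key)


def annotate(tokens):
    """Pair each token with its lemma, as a lemma annotation layer in a corpus might."""
    return [(token, lemmatize(token)) for token in tokens]


if __name__ == "__main__":
    for token, lemma in annotate(["She", "does", "what", "others", "have", "done"]):
        print(f"{token}\t{lemma}")
```

Tokens missing from the table are simply lowercased and returned unchanged, so a form such as "others" is not reduced here. This shows why a plain lookup is only a starting point for annotating a real corpus.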