Opinion Mining

Opinion mining is a subdiscipline of computational linguistics and data mining. It deals with the automatic extraction and classification of opinions. The task is also referred to as sentiment analysis, and it is closely related to appraisal theory. The terms differ slightly in emphasis: opinion mining tends to rely on more automated methods, whereas sentiment analysis more often identifies subjectivity through features obtained with natural language processing techniques. Appraisal theory is a linguistic framework for extracting opinions. It extends Halliday's linguistic theories and is grounded in Systemic Functional Linguistics (cf. Bloom and Argamon 2010, 250). In this framework, appraisal denotes how language is used to express an attitude towards a target (cf. Whitelaw et al. 2005, 626). Appraisal expressions are therefore relevant for opinion mining.

Sentiment Classification

Opinion mining includes two classification tasks:

  1. the classification between objective and subjective elements in a text
  2. the classification between positive and negative opinions
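
The two tasks can be chained: a text is first checked for subjectivity, and only subjective text is then assigned a polarity. The following is a minimal sketch of such a cascade, not an established method. It assumes a tiny hand-made word list (the entries in POSITIVE and NEGATIVE are hypothetical and chosen only for illustration) and treats any text containing a listed word as subjective.

```python
import re

# Hypothetical toy lexicon, for illustration only; real systems use
# much larger resources or a trained classifier.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "boring", "terrible", "disappointing"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def is_subjective(tokens):
    """Task 1: a text counts as subjective if it contains any opinion word."""
    return any(t in POSITIVE or t in NEGATIVE for t in tokens)


def polarity(tokens):
    """Task 2: compare the number of positive and negative words.

    Ties are resolved as "positive", which is an arbitrary choice.
    """
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score >= 0 else "negative"


def classify(text):
    """Run both tasks in sequence and return one of three labels."""
    tokens = tokenize(text)
    if not is_subjective(tokens):
        return "objective"
    return polarity(tokens)


for sentence in [
    "The film runs for two hours.",
    "A great and enjoyable film.",
    "The plot was boring and terrible.",
]:
    print(sentence, "->", classify(sentence))
```

A word list of this kind ignores negation and context, so it only illustrates how the two decisions relate to each other.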

Furthermore, sentiment classification can be performed at different levels: the document, the sentence, or the phrase/word level. Different machine learning approaches can be applied, including unsupervised, supervised, and semi-supervised learning. In computational linguistics, the supervised approach is the most common, as it allows linguists to include linguistic features such as POS tags or phrase patterns. Other possible features include the occurrence of specific words or punctuation marks.
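
As an illustration of the supervised approach, the sketch below trains a simplified Naive Bayes-style classifier on word-occurrence and punctuation features. It is only one of many possible setups: the four training sentences are invented, the feature names are hypothetical, and the scoring considers only the features present in a text. POS tags and phrase patterns are left out because they would require an external tagger.

```python
import math
import re
from collections import Counter, defaultdict


def extract_features(text):
    """Return binary features: word occurrences plus two punctuation cues."""
    features = {"word=" + w for w in re.findall(r"[a-z']+", text.lower())}
    if "!" in text:
        features.add("punct=exclamation")
    if "?" in text:
        features.add("punct=question")
    return features


def train(examples):
    """Count how often each feature occurs in texts of each label."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for feature in extract_features(text):
            feature_counts[label][feature] += 1
            vocabulary.add(feature)
    return label_counts, feature_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log score for the given text."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    # Features never seen during training are ignored.
    features = extract_features(text) & vocabulary
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        for feature in features:
            # Add-one smoothed estimate of P(feature present | label).
            score += math.log((feature_counts[label][feature] + 1) / (count + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data, for illustration only.
TRAINING = [
    ("A wonderful, moving film!", "positive"),
    ("I loved the acting and the story.", "positive"),
    ("What a boring waste of time.", "negative"),
    ("The plot was dull and the acting was awful.", "negative"),
]

model = train(TRAINING)
print(predict(model, "The story was wonderful!"))
print(predict(model, "The plot was boring."))
```

In practice, a classifier of this kind would be trained on an annotated corpus rather than on a handful of sentences, and the same train/predict split applies at the document, sentence, or phrase level.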

A few opinion-annotated corpora are available for training a classifier. For English, these include the Cornell movie-review corpus, which contains sentiment-classified movie reviews, and MPQA, a sentiment-classified news corpus. For German, the MLSA, a multi-layered reference corpus for German-language sentiment analysis, and the German part of the Subjectivity in News Corpus (SNC) are publicly available. The MLSA dataset is available at https://sites.google.com/site/iggsahome/downloads and the SNC at http://130.149.154.91/corpus/snc/SNC.de.zip.

References