Part-of-Speech tagging is an important preprocessing step for many applications in Natural Language Processing. This importance is reflected in the many PoS tagger implementations available today. Which one do you use? Are you sure it is the one best suited to your needs?
When choosing a PoS tagger, there are two properties that should influence your choice:
Speed and Accuracy
Big Data scenarios put a stronger focus on speed than the Digital Humanities, where speed is often of minor importance.
Besides the well-known PoS taggers provided by Stanford and the TreeTagger, there are many more alternatives. Each implementation often provides more than one model, so which is the best?
We evaluated a total of 27 models for 9 different PoS tagger implementations. The tagger implementations are listed below; we evaluated them on two languages, English and German.
For English, we evaluated each tagger model on the following corpora: British National Corpus, Brown, Gimpel, MASC, Switchboard. For German, we evaluated on the Tüba-D/Z and Rehbein corpora.
We excluded the Wall Street Journal corpus for English and the Tiger and Negra corpora for German, as many models have been trained on those corpora.
For each language, we evaluate every PoS tagger model on every corpus and additionally measure the runtime of the tagging. The measurement starts right before the tagger is called and ends right after it returns. The figure below shows the workflow of our experiment.
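The measurement described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiment: the `evaluate` function, the `Tagger` callable type, and the toy `always_noun` tagger are hypothetical names, and calling the tagger once per sentence is an assumption.

```python
import time
from typing import Callable, List, Sequence, Tuple

Sentence = List[str]
# Hypothetical interface: a tagger maps a list of tokens to a list of tags.
Tagger = Callable[[Sentence], List[str]]


def evaluate(tagger: Tagger,
             corpus: Sequence[Tuple[Sentence, List[str]]]) -> Tuple[float, float]:
    """Return (accuracy, seconds spent inside the tagger) for one model on one corpus."""
    correct = 0
    total = 0
    elapsed = 0.0
    for tokens, gold in corpus:
        start = time.perf_counter()              # timer starts right before the call
        predicted = tagger(tokens)
        elapsed += time.perf_counter() - start   # and stops right after it returns
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    accuracy = correct / total if total else 0.0
    return accuracy, elapsed


def always_noun(tokens: Sentence) -> List[str]:
    """Toy stand-in for a real tagger: labels every token as NOUN."""
    return ["NOUN"] * len(tokens)


toy_corpus = [
    (["the", "dog"], ["DET", "NOUN"]),
    (["barks"], ["VERB"]),
]
accuracy, seconds = evaluate(always_noun, toy_corpus)
```

Keeping the timer strictly around the tagger call means that corpus reading and accuracy bookkeeping do not count towards the reported runtime.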
To overcome the differences in the tagsets of the various corpora, we harmonised the tags to a coarse-grained tagset consisting of eleven tags.
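A harmonisation step of this kind can be sketched as a simple lookup from fine-grained to coarse tags. The mapping excerpts below are illustrative assumptions: the fine-grained tags are taken from the Penn Treebank and STTS tagsets, but the coarse labels (`NOUN`, `VERB`, `ADJ`, `DET`, `OTHER`) and the `harmonise` helper are placeholders, not the eleven tags actually used in the evaluation.

```python
from typing import Dict, List

# Illustrative excerpts only; the coarse labels are placeholders.
PTB_TO_COARSE: Dict[str, str] = {
    "NN": "NOUN", "NNS": "NOUN",
    "VBZ": "VERB", "VBD": "VERB",
    "JJ": "ADJ",
    "DT": "DET",
}
STTS_TO_COARSE: Dict[str, str] = {
    "NN": "NOUN", "NE": "NOUN",
    "VVFIN": "VERB",
    "ADJA": "ADJ",
    "ART": "DET",
}


def harmonise(tags: List[str], mapping: Dict[str, str],
              fallback: str = "OTHER") -> List[str]:
    """Map fine-grained tags to coarse tags; unknown tags get the fallback label."""
    return [mapping.get(tag, fallback) for tag in tags]


english = harmonise(["DT", "NNS", "VBD"], PTB_TO_COARSE)
german = harmonise(["ART", "NN", "VVFIN"], STTS_TO_COARSE)
```

After harmonisation, gold and predicted tags from different tagsets can be compared directly, since both sides use the same coarse labels.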
The samples highlighted in red are the ones showing the best speed/accuracy combination. The surprising winner is the rule-based Hepple tagger.
The most accurate German tagger is the TreeTagger. HunPos offers a reasonable trade-off between speed and accuracy. We currently do not have a rule-based tagger for German to test whether the results of the Hepple tagger transfer to German.
How to cite us?
Horsmann, Tobias; Erbs, Nicolai; Zesch, Torsten (2015): Fast or Accurate? – A Comparative Evaluation of PoS Tagging Models. Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL-2015), Essen, Germany.