MSc project - Örvar Kárason

Title: Data-driven Part-of-Speech Taggers for Icelandic: Comparison and Error Analysis. The defense will be streamed on Zoom:

  • 31.3.2023, 10:30 - 11:30, at RU and streamed online


Part-of-Speech (POS) tagging is a sequential labelling task in which words, punctuation, and symbols occurring in running text, i.e., tokens, are assigned a tag describing their morphosyntactic features.

To predict the correct tag, the tagger relies on the context of the token in a sentence and its orthographic form. POS tagging is an important step for many Natural Language Processing applications.
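As a rough illustration of what "context" and "orthographic form" can mean in practice, the following minimal Python sketch extracts a few features for each token in a sentence. The function name, the feature names, and the example sentence are assumptions made for illustration only; they are not taken from the project or from any of the taggers it evaluates.

```python
def token_features(tokens, i):
    """Hypothetical feature set for the token at position i (illustrative only)."""
    word = tokens[i]
    return {
        # Orthographic form of the token itself
        "lower": word.lower(),
        "suffix3": word[-3:],
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        # Context: neighbouring tokens, with sentence-boundary markers
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


# Toy example sentence, already split into tokens
tokens = ["Hesturinn", "hljóp", "hratt", "."]
features = [token_features(tokens, i) for i in range(len(tokens))]

for token, feats in zip(tokens, features):
    print(token, feats)
```

A data-driven tagger would learn a mapping from features like these (or from learned representations that replace them) to tags; the sketch only shows the input side of that process.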

Over the last two decades, steady progress has been made in POS tagging for Icelandic. Various taggers have been presented that improve on previous state-of-the-art methods. During that period, work on Icelandic corpora has also progressed. Existing corpora have undergone error correction phases, and in some cases been expanded with new data. A new, larger gold standard corpus for POS tagging was created to replace the older standard. Furthermore, alterations have been made to the fine-grained tagset used for Icelandic: it has been simplified a couple of times, with tags being removed or merged into others, and new tags have been added.

This variability over the years means that reported results for taggers are not easily comparable. In this project, we train and test four data-driven taggers that have been employed for Icelandic, using the latest version of the current gold standard corpus and tagset, as well as the latest versions of any augmentation data used. These taggers represent four different models: a Hidden Markov model, an Averaged Perceptron algorithm, a Bidirectional Long Short-Term Memory neural network, and a transformer neural network. We compare the accuracy of the four models and examine where each model's improvements stem from. We also perform an error analysis of the results of the transformer model, which obtains the highest accuracy.

Now that the latest tagging method, based on the transformer model, is surpassing 97% accuracy, one might question whether further gains can be achieved. The upper bound of inter-annotator agreement for morphosyntactic analysis is generally considered to be between 97% and 98%. Is POS tagging now perhaps a solved problem for Icelandic? We draw a random sample of errors common to all four models and classify them with regard to insolubility. This analysis reveals annotation errors in the gold standard corpus as well as insoluble tagging errors due to insufficient context information. We calculate the lower bounds for these error classes and estimate that, by correcting the annotation errors in the gold standard and making some improvements to the model, the accuracy could surpass 98%.
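The idea of comparing taggers and isolating the errors they share can be sketched in a few lines of Python. Everything below is a toy illustration under stated assumptions: the tag labels, the tagger names, and the predictions are invented, and the helper functions are hypothetical rather than the project's actual evaluation code.

```python
def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def common_errors(gold, predictions):
    """Token positions that every tagger in `predictions` gets wrong."""
    error_sets = [
        {i for i, (g, p) in enumerate(zip(gold, pred)) if g != p}
        for pred in predictions.values()
    ]
    return set.intersection(*error_sets)


# Invented gold tags and tagger outputs (not real results)
gold = ["N", "V", "ADV", "P", "N"]
predictions = {
    "tagger_a": ["N", "N", "ADV", "P", "V"],
    "tagger_b": ["N", "V", "ADV", "P", "V"],
    "tagger_c": ["N", "V", "ADJ", "P", "V"],
    "tagger_d": ["N", "V", "ADV", "P", "V"],
}

for name, pred in predictions.items():
    print(name, accuracy(gold, pred))

shared = common_errors(gold, predictions)
print("positions mislabelled by all taggers:", sorted(shared))
```

Positions that every tagger mislabels are natural candidates for manual inspection, since they may point to an error in the gold annotation or to a token that cannot be resolved from the available context.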

The committee members are the following:


Hrafn Loftsson, supervisor, Associate Professor at RU

Anna Sigríður Islind, Associate Professor at RU

Stefán Ólafsson, Assistant Professor at RU

Please note that at events hosted at Reykjavik University (RU), photographs and videos are taken which might be used for RU marketing purposes. Read more about this on our or send an e-mail: