Tagging Icelandic Text (Mörkun íslensks texta)

  • Sigrún Helgadóttir, Stofnun Árna Magnússonar í íslenskum fræðum
Keywords: part-of-speech tag, tagging, tagger

Abstract

Corpora are important linguistic resources. In this paper I describe an Icelandic corpus that is part of a collection of corpora in 17 languages. The collection can be accessed online at http://corpora.informatik.uni-leipzig.de and, with Icelandic instructions, at http://wortschatz.uni-leipzig.de/ws_ice/index.php. I discuss possible uses of the corpus as a corpus-based dictionary for users who are not linguists, and as a research tool in both applied and theoretical linguistics.
Published
2020-07-26
Section
Non-refereed Short Papers