Automatic Analysis of Semantic Relations in Íslensk orðabók (Sjálfvirk greining merkingarvensla í Íslenskri orðabók)

Authors

  • Anna Björk Nikulásdóttir, Ruprecht-Karls-Universität Heidelberg

Abstract

In the design of electronic dictionaries, information can be organized according to the meaning of the lexemes. This article demonstrates a method for automatically extracting semantic relations from dictionary definitions. The definitions of all noun lexemes in a monolingual Icelandic dictionary, Íslensk orðabók, were analyzed. First, the definitions were tagged with Brants' TnT tagger, which had been trained on an Icelandic corpus. From the tagged data, the POS patterns of the definitions were extracted, and rules for extracting the semantic relations were developed. The rule algorithm was implemented in Smalltalk, resulting in the tool MERKOR. The results of the analysis were promising. The test used a random set of lexemes, about 1.34% of the data. For each lexeme the result could be completely right, meaning that MERKOR found all semantic relations in the reference data, or partly right, meaning that MERKOR did not find every relation in the reference data but nevertheless did not extract any wrong word or relation. Precision ranged from 82.13% (completely right analyses only) to 94.77% (completely right plus partly right).
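The two precision figures follow from how each test lexeme is classified. The following is a minimal Python sketch of that scoring, not MERKOR's own code (which was written in Smalltalk). The representation of a relation as a (type, word) pair, the function names, and the sample data are all assumptions made for illustration. The sketch also assumes that a lexeme with no extracted relations counts as partly right, since nothing wrong was extracted.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: (relation type, related word).
Relation = Tuple[str, str]


def classify(extracted: Set[Relation], reference: Set[Relation]) -> str:
    """Classify one lexeme's analysis against its reference relations."""
    if extracted - reference:
        # At least one extracted word or relation is not in the reference.
        return "wrong"
    if extracted == reference:
        # Every reference relation was found and nothing else.
        return "complete"
    # Some reference relations were missed, but nothing wrong was extracted.
    return "partial"


def precision_range(
    results: Iterable[Tuple[Set[Relation], Set[Relation]]]
) -> Tuple[float, float]:
    """Return (strict, lenient) precision over (extracted, reference) pairs."""
    labels = [classify(extracted, reference) for extracted, reference in results]
    total = len(labels)
    complete = labels.count("complete")
    partial = labels.count("partial")
    return complete / total, (complete + partial) / total


# Invented sample data, for illustration only.
sample = [
    ({("hypernym", "dýr")}, {("hypernym", "dýr")}),
    ({("hypernym", "hús")}, {("hypernym", "hús"), ("synonym", "bygging")}),
    ({("hypernym", "steinn")}, {("hypernym", "málmur")}),
    ({("hypernym", "fugl")}, {("hypernym", "fugl")}),
]
strict, lenient = precision_range(sample)
print(f"strict: {strict:.2%}, lenient: {lenient:.2%}")
```

Under this reading, the lower figure reported above counts only completely right lexemes, and the upper figure also counts partly right ones.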

Published

2020-07-25

Issue

Section

Non-refereed Short Papers