Discussion papers

CPM-16-229 - 22 September 2016

Shaheen has a paper at IC3K on "Bootstrapping a Semantic Lexicon on Verb Similarities"

Bootstrapping a Semantic Lexicon on Verb Similarities by Shaheen Syed, Marco Spruit and Melania Borit @ IC3K 2016

Keywords: Semantic Lexicon, Bootstrapping, Extraction Patterns, Web Mining

Abstract: We present a bootstrapping algorithm that creates a semantic lexicon from a list of seed words and a corpus mined from the web. We exploit extraction patterns to bootstrap the lexicon and use collocation statistics to dynamically score new lexicon entries. Extraction patterns are subsequently scored by calculating their conditional probability relative to an unrelated text corpus. We find that highly domain-related verbs achieve the highest accuracy, and that collocation statistics affect accuracy both positively and negatively over the bootstrapping runs.
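The loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes an extraction pattern is just the (left word, right word) pair around a token, it uses raw co-occurrence counts as a stand-in for the collocation statistics, and every name in it (pattern_matches, bootstrap_lexicon, n_patterns, n_words) is hypothetical.

```python
from collections import Counter


def pattern_matches(sentences):
    """Yield ((left, right), filler) for every interior token of each sentence."""
    for sent in sentences:
        for i in range(1, len(sent) - 1):
            yield (sent[i - 1], sent[i + 1]), sent[i]


def bootstrap_lexicon(seeds, domain, general, rounds=2, n_patterns=1, n_words=2):
    """Grow a lexicon from seed words (assumed, simplified scoring)."""
    lexicon = set(seeds)
    domain_freq = Counter(p for p, _ in pattern_matches(domain))
    general_freq = Counter(p for p, _ in pattern_matches(general))

    def score(pattern):
        # Estimated P(domain | pattern): contexts that are also common in
        # the unrelated corpus receive a low score.
        d, g = domain_freq[pattern], general_freq[pattern]
        return d / (d + g)

    for _ in range(rounds):
        # Candidate patterns are the contexts around current lexicon entries.
        candidates = {p for p, w in pattern_matches(domain) if w in lexicon}
        best = sorted(candidates, key=lambda p: (-score(p), p))[:n_patterns]
        # New entries are fillers of the best patterns, ranked by how often
        # they co-occur with those patterns (ties broken alphabetically).
        found = Counter(
            w for p, w in pattern_matches(domain)
            if p in best and w not in lexicon
        )
        ranked = sorted(found, key=lambda w: (-found[w], w))
        lexicon.update(ranked[:n_words])
    return lexicon


# Made-up miniature corpora, tokenised by hand for the example.
domain = [
    ["fishers", "catch", "cod"],
    ["fishers", "harvest", "cod"],
    ["fishers", "land", "cod"],
    ["they", "catch", "fish"],
    ["they", "eat", "fish"],
]
general = [
    ["they", "eat", "fish"],
    ["they", "like", "fish"],
    ["they", "buy", "fish"],
]
result = bootstrap_lexicon({"catch"}, domain, general)
```

In this toy run the context ("they", "fish") also occurs in the unrelated corpus, so it scores lower than ("fishers", "cod") and the generic verb "eat" is never proposed as a lexicon entry.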

(KDIR is part of IC3K, the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2016 will be held in conjunction with IJCCI 2016. The purpose of IC3K is to bring together researchers, engineers and practitioners in the areas of Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K is composed of three co-located conferences, each specialized in at least one of these knowledge areas.)
