PUBLICATIONS

Modeling common real-word relations using triples extracted from n-grams

Authors

Ruben Sipoš, Dunja Mladenić, Marko Grobelnik

Publication date

2009

Publisher

Springer Berlin Heidelberg

Total citations

Cited by 6

Description

In this paper, we present an approach that provides generalized relations for automatic ontology building based on frequent word n-grams. Using the publicly available Google n-grams as our data source, we extract relations in the form of triples and compute generalized, more abstract models. We propose an algorithm for building abstractions of the extracted triples using WordNet as background knowledge. We also present a novel approach to triple extraction using heuristics, which achieves notably better results than deep parsing applied to n-grams. This allows us to represent information gathered from the web as a set of triples modeling the common and frequent relations expressed in natural language. Our results can be used in different settings, for example as a knowledge base for reasoning or simply as statistical data for improving the understanding of natural languages.
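
The general idea of turning frequent n-grams into triples and then abstracting them can be illustrated with a minimal sketch. This is not the authors' algorithm: the extraction heuristic (treat any three-token n-gram as subject, verb, object), the hand-written `HYPERNYM` map standing in for WordNet, and all function names and counts are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical, hand-written hypernym map standing in for WordNet.
HYPERNYM = {
    "man": "person",
    "woman": "person",
    "bread": "food",
    "cheese": "food",
}


def extract_triple(ngram):
    """Toy heuristic (an assumption): a 3-token n-gram is (subject, verb, object)."""
    tokens = ngram.lower().split()
    if len(tokens) != 3:
        return None
    return tuple(tokens)


def generalize(word):
    """Replace a word with its hypernym if one is known, else keep the word."""
    return HYPERNYM.get(word, word)


def abstract_triples(ngram_counts):
    """Merge the frequencies of triples that share the same abstraction."""
    abstracted = Counter()
    for ngram, count in ngram_counts.items():
        triple = extract_triple(ngram)
        if triple is None:
            continue  # skip n-grams the heuristic cannot handle
        subj, verb, obj = triple
        abstracted[(generalize(subj), verb, generalize(obj))] += count
    return abstracted


# Made-up n-gram frequencies, for illustration only.
counts = {"man eats bread": 120, "woman eats cheese": 80, "the man": 500}
result = abstract_triples(counts)
```

Here the two concrete triples collapse into the single abstract triple `("person", "eats", "food")` with their frequencies summed, while the two-token n-gram is ignored. A real system would need part-of-speech information, word-sense handling, and a principled choice of how far up the hypernym hierarchy to generalize.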
