PUBLICATIONS

Predicting content from hyperlinks

Authors

Dunja Mladenic, Marko Grobelnik

Publication date

1999

Publisher

Total citations

Cited by 13

Description

This paper describes an approach to predicting the content of a document from the hyperlink that points to it. The k-Nearest Neighbor algorithm is used to predict a set of words that appear in the document. Experiments are performed on real-world data obtained from the Web. The proposed approach gives promising results: on the tested data, on average about 33% of document words are correctly predicted, while about 15% of all predicted words appear in the document. The predicted words are chosen from about 4,000 to 8,000 different words and word pairs that appear in the training examples.
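
The two percentages in the description can be read as recall (the share of document words that were predicted) and precision (the share of predicted words that appear in the document). The following is a minimal Python sketch of that idea, not the authors' implementation. The bag-of-words representation of the hyperlink text, the cosine similarity, the voting threshold `min_votes`, and all function names are assumptions made for illustration only; the description does not specify them.

```python
import math
from collections import Counter


def bag(text):
    """Bag-of-words representation of a hyperlink's text (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bags of words (assumed similarity measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_words(link_text, training, k=2, min_votes=2):
    """Predict a set of document words from hyperlink text with k-nearest neighbors.

    training: list of (hyperlink_text, set_of_document_words) pairs.
    A word is predicted if at least min_votes of the k nearest
    training examples contain it (hypothetical voting rule).
    """
    query = bag(link_text)
    ranked = sorted(training, key=lambda ex: cosine(query, bag(ex[0])), reverse=True)
    votes = Counter()
    for _, words in ranked[:k]:
        votes.update(words)
    return {w for w, count in votes.items() if count >= min_votes}


def precision_recall(predicted, actual):
    """Precision: share of predicted words that appear in the document.
    Recall: share of document words that were predicted."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Toy training data, invented for illustration.
training = [
    ("machine learning course", {"learning", "course", "lecture", "algorithm"}),
    ("machine learning papers", {"learning", "paper", "algorithm", "model"}),
    ("football results", {"football", "score", "league"}),
]

predicted = predict_words("machine learning tutorial", training)
actual = {"learning", "algorithm", "tutorial", "example"}
print(predicted, precision_recall(predicted, actual))
```

In this sketch, raising `min_votes` or lowering `k` would typically trade recall for precision, which is one way to picture why the two reported figures differ.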

Publication
