Authors
Miha Grcar, Dunja Mladenic, Marko Grobelnik
Description
In this paper, we present our experience in applying collaborative filtering to real-life corporate data, with a focus on data quality and sparsity. The quality of collaborative filtering recommendations depends heavily on the quality of the data used to identify users’ preferences. To understand how highly sparse, server-side collected data affects the accuracy of collaborative filtering, we ran a series of experiments on publicly available datasets and on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We also experimentally compared two standard distance measures (Pearson correlation and cosine similarity) used by the k-Nearest Neighbor classifier. Depending on the dataset, one outperforms the other, but no consistent difference can be claimed.
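
To make the comparison of the two measures concrete, the following is a minimal sketch of how Pearson correlation and cosine similarity might be computed on sparse user ratings and used to pick nearest neighbors. It is not the authors' implementation: the function names, the toy `ratings` data, and the choice to treat a missing rating as zero in the cosine measure while restricting Pearson to co-rated items are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating dicts (missing rating = 0, an assumption)."""
    dot = sum(r * b[item] for item, r in a.items() if item in b)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation computed only over items both users rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def nearest_neighbors(target, ratings, similarity, k):
    """Return the k users most similar to `target` under the given measure."""
    scored = [
        (similarity(ratings[target], other), user)
        for user, other in ratings.items()
        if user != target
    ]
    # Highest similarity first; ties broken by user id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in scored[:k]]


# Hypothetical toy data: user -> {item: rating}. Most entries are missing (sparse).
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3},
    "u3": {"a": 1, "b": 5},
    "u4": {"d": 4},
}

for measure in (pearson_correlation, cosine_similarity):
    print(measure.__name__, nearest_neighbors("u1", ratings, measure, k=2))
```

On this toy data the two measures agree on the closest neighbor of `u1` but rank the remaining users differently, because Pearson penalizes the opposite rating trend of `u3` while cosine rewards any overlap at all. With very sparse data the overlap between two users is often a single item or none, in which case this sketch falls back to a similarity of zero.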