PUBLICATIONS

Data Quality and Sparsity Issues in Collaborative Filtering on Web Logs

Authors

Miha Grcar, Dunja Mladenic, Marko Grobelnik

Description

In this paper, we present our experience in applying collaborative filtering to real-life corporate data in light of data quality and sparsity. The quality of collaborative filtering recommendations depends heavily on the quality of the data used to identify users’ preferences. To understand the influence that highly sparse, server-side collected data has on the accuracy of collaborative filtering, we ran a series of experiments on publicly available datasets and on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We also experimentally compared two standard distance measures (Pearson correlation and cosine similarity) used by the k-Nearest Neighbor classifier, showing that, depending on the dataset, one outperforms the other, but no consistent difference can be claimed.
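
For readers unfamiliar with the two measures named above, the following is a minimal Python sketch of how they might be computed between two users. It is not the authors' implementation. The representation (a dict mapping item to rating), the function names, the choice to compute Pearson correlation over co-rated items only, and the choice to treat missing ratings as zero for cosine similarity are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (item -> rating).

    Assumption: an item missing from a dict counts as a rating of zero.
    """
    common = a.keys() & b.keys()
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation over the items both users rated.

    Assumption: returns 0.0 when fewer than two items overlap or when
    either user's co-rated ratings have zero variance.
    """
    common = sorted(a.keys() & b.keys())
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Hypothetical users: the second rates every item one point lower than the first.
user_1 = {"page_a": 5, "page_b": 3, "page_c": 4}
user_2 = {"page_a": 4, "page_b": 2, "page_c": 3}
user_3 = {"page_d": 5}  # shares no items with user_1

print(pearson_correlation(user_1, user_2))
print(cosine_similarity(user_1, user_2))
print(cosine_similarity(user_1, user_3))
```

With very sparse data, many user pairs share few or no items, so both functions fall back to 0.0 often. That is one way sparsity can limit a neighbor-based recommender.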
