Authors
Blaz Fortuna, John Shawe-Taylor
Publication date
2005
Description
Eigen-analysis methods such as LSI and KCCA have already been applied successfully to cross-lingual information retrieval. This approach has a weakness: it requires an aligned training set of documents. In this paper we address this weakness and show that it can be successfully avoided through the use of machine translation. We show that performance is similar on domains where human-generated training sets are available. For other domains, however, artificial training sets can be generated that significantly outperform human-generated ones obtained from a different domain.
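
The idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple cross-lingual LSI variant (SVD of stacked bag-of-words vectors for each aligned pair) and uses a hypothetical word-by-word dictionary, `TOY_MT`, as a stand-in for a real machine translation system. All names, documents, and the choice of `k` are illustrative assumptions.

```python
import numpy as np

# Hypothetical stand-in for a machine translation system (assumption):
# a word-by-word English -> French lookup table.
TOY_MT = {
    "cat": "chat", "dog": "chien", "eats": "mange", "fish": "poisson",
    "bone": "os", "stock": "action", "market": "marche",
    "falls": "baisse", "rises": "monte", "price": "prix",
}

def translate(doc):
    """Translate a document word by word with the toy dictionary."""
    return " ".join(TOY_MT[w] for w in doc.split())

english_docs = [
    "cat eats fish", "dog eats bone", "cat dog fish bone",
    "stock market falls", "stock price rises", "market price falls",
]
# Artificial aligned training set: each document paired with its translation.
pairs = [(d, translate(d)) for d in english_docs]

def build_vocab(docs):
    words = sorted({w for d in docs for w in d.split()})
    return {w: i for i, w in enumerate(words)}

def bow(doc, vocab):
    """Bag-of-words count vector; out-of-vocabulary words are ignored."""
    v = np.zeros(len(vocab))
    for w in doc.split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

en_vocab = build_vocab([e for e, _ in pairs])
fr_vocab = build_vocab([f for _, f in pairs])

# Each column stacks the English and French vectors of one aligned pair.
D = np.column_stack(
    [np.concatenate([bow(e, en_vocab), bow(f, fr_vocab)]) for e, f in pairs]
)

k = 2  # number of latent dimensions kept (illustrative choice)
U, s, Vt = np.linalg.svd(D, full_matrices=False)
U_k = U[:, :k]

def embed(doc, lang):
    """Project a monolingual document into the shared latent space,
    zero-padding the half that belongs to the other language."""
    en = bow(doc, en_vocab) if lang == "en" else np.zeros(len(en_vocab))
    fr = bow(doc, fr_vocab) if lang == "fr" else np.zeros(len(fr_vocab))
    return U_k.T @ np.concatenate([en, fr])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve(query, docs, lang_q="en", lang_d="fr"):
    """Return the index of the document most similar to the query."""
    q = embed(query, lang_q)
    scores = [cosine(q, embed(d, lang_d)) for d in docs]
    return int(np.argmax(scores))

french_docs = ["chat mange poisson", "action marche baisse"]
print(retrieve("cat fish", french_docs))
print(retrieve("stock price", french_docs))
```

In this toy setting an English query is matched against French documents without any human-aligned corpus: the only aligned data comes from the translation step. A real system would replace `TOY_MT` with an actual translation system and use far larger corpora.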