Authors
Blaž Fortuna, Nello Cristianini, John Shawe-Taylor
Publication date
2007
Publisher
IGI Global
Description
We present a general method that uses kernel canonical correlation analysis (KCCA) to learn a semantic representation of text from an aligned multilingual collection of text documents. The resulting semantic space provides a language-independent representation of text and enables comparison between text documents written in different languages. In experiments, we apply KCCA to cross-lingual retrieval of text documents, where the text query is written in only one language, and to cross-lingual text categorization, where we train a cross-lingual classifier.
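As an illustration of the idea in the description, the following is a minimal sketch of regularized KCCA on two aligned views, written with NumPy only. It is not taken from the publication: the function names (`center_kernel`, `kcca`), the regularization parameter `reg`, the choice of a linear kernel, and the synthetic two-view data are all assumptions made for the example. The eigenproblem used is one commonly cited regularized dual formulation, (Kx + reg I)^-1 Ky (Ky + reg I)^-1 Kx alpha = rho^2 alpha, and may differ from the exact variant the authors used.

```python
import numpy as np


def center_kernel(K):
    """Center a kernel matrix so the implicit features have zero mean."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kcca(Kx, Ky, reg=1.0, n_components=2):
    """Regularized KCCA in the dual (hypothetical helper, not the authors' code).

    Kx, Ky: centered n x n kernel matrices of the two aligned views.
    Returns dual weights alpha, beta (n x n_components) and correlations rho.
    """
    n = Kx.shape[0]
    I = np.eye(n)
    Rx = np.linalg.solve(Kx + reg * I, Ky)  # (Kx + reg I)^-1 Ky
    Ry = np.linalg.solve(Ky + reg * I, Kx)  # (Ky + reg I)^-1 Kx
    evals, evecs = np.linalg.eig(Rx @ Ry)   # eigenvalues are rho^2
    order = np.argsort(-evals.real)[:n_components]
    rho = np.sqrt(np.clip(evals.real[order], 0.0, None))
    alpha = evecs[:, order].real
    # beta follows from alpha: beta = (Ky + reg I)^-1 Kx alpha / rho
    beta = (Ry @ alpha) / np.maximum(rho, 1e-12)
    return alpha, beta, rho


# Synthetic stand-in for an aligned bilingual corpus (assumption):
# 40 document pairs whose two "languages" share 2 latent topics.
rng = np.random.default_rng(0)
topics = rng.normal(size=(40, 2))
X = topics @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(40, 10))
Y = topics @ rng.normal(size=(2, 8)) + 0.01 * rng.normal(size=(40, 8))

Kx = center_kernel(X @ X.T)  # linear kernel on view 1
Ky = center_kernel(Y @ Y.T)  # linear kernel on view 2
alpha, beta, rho = kcca(Kx, Ky, reg=0.1)

# Coordinates of the training documents in the shared semantic space.
Zx = Kx @ alpha
Zy = Ky @ beta
```

In a retrieval setting, a query in one language would be mapped into the same space through its kernel values against the training documents of that language (multiplied by `alpha`), and compared with documents of the other language mapped through `beta`. With real text, the rows of `X` and `Y` would typically be bag-of-words or TF-IDF vectors of paired documents.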