Authors
Jure Leskovec, John Shawe-Taylor
Publication date
2005
Description
We present a set of methods for creating a semantic representation from a collection of textual documents. Given a document collection, we use a simple algorithm to connect the documents into a tree or a graph. Using the imposed topology, we define feature and document similarity measures. We use kernel alignment to compare the quality of the various similarity measures. Results show that the document similarity defined over the topology gives better alignment than the standard cosine similarity measure on a bag-of-words document representation.
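
As a rough illustration of the evaluation idea only, the sketch below computes the alignment between a bag-of-words cosine similarity matrix and a target matrix, using the standard definition A(K1, K2) = <K1, K2>_F / sqrt(<K1, K1>_F <K2, K2>_F). The toy term counts, the class labels, the choice of the label outer product as the alignment target, and the function names `cosine_kernel` and `kernel_alignment` are all assumptions made for this example; the description does not say which target or data the authors used, and the topology-based similarity is not reproduced here.

```python
import numpy as np


def cosine_kernel(X):
    """Cosine similarity matrix for the rows of a bag-of-words matrix X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave empty documents as zero vectors
    Xn = X / norms
    return Xn @ Xn.T


def kernel_alignment(K1, K2):
    """Frobenius inner product of K1 and K2, normalised by their norms."""
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return num / den


# Hypothetical toy corpus: 4 documents over a 4-term vocabulary.
X = np.array(
    [
        [2, 1, 0, 0],
        [1, 2, 0, 0],
        [0, 0, 1, 2],
        [0, 0, 2, 1],
    ],
    dtype=float,
)

# Hypothetical class labels; the target kernel y y^T is an assumed choice.
y = np.array([1, 1, -1, -1])

K_cos = cosine_kernel(X)
K_target = np.outer(y, y)
alignment = kernel_alignment(K_cos, K_target)
print(f"alignment of cosine kernel with target: {alignment:.4f}")
```

Any other document similarity matrix, such as one derived from a tree or graph over the documents, could be passed to `kernel_alignment` in place of `K_cos` and compared against the same target; a higher value indicates closer agreement with the target.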