Authors
David Baxter,
Bryan Klimt,
Marko Grobelnik,
Michael Witbrock
Publication date
2009
Publisher
Springer Berlin Heidelberg
Description
When dealing with a document collection, it is important to identify repeated information. In multi-document summarization, for example, widely repeated content should be retained even when its wording differs. Simple approaches look only for identical strings, or identical syntactic structures (including words), across documents. Here we investigate semantic matching, applying background knowledge from a large, general knowledge base (KB) to identify such repeated information in texts. Automatic document summarization is the problem of creating a surrogate for a document that adequately represents its full content. Automatic ontology generation requires information about candidate types, roles and relationships gathered from across a document or document collection. We aim at a summarization system that can replicate the quality of summaries created by humans and ontology …