Authors
William Martin, John Shawe-Taylor
Publication date
2012
Description
A set of texts is often a poor representation of the language it is written in, and as a result topics can seem nonsensical to domain experts. This can happen for several reasons: misspellings or ‘accidental words’ can be given statistical significance when too many topics are learned; words can appear related or unrelated in the text even when the opposite is true in the language at large; or simply too few or too many topics are used. In this position paper we present a novel approach: applying biases derived from external sources during the training process. This improves topic coherence [Newman et al., 2009, 2010], irons out many of the issues that a sub-optimal number of topics can cause, and imbues the resulting models with real-world word relationships.
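The abstract does not specify how the external biases enter training, so the following is only a minimal illustrative sketch, not the authors' method: one plausible instantiation is a collapsed-Gibbs LDA sampler whose symmetric word prior is replaced with pseudo-counts derived from an external word-similarity matrix, so that words related in the language tend to share topics. The toy corpus, the `similarity` matrix, and `biased_beta` are all illustrative assumptions.

```python
# Hedged sketch: LDA Gibbs sampling with an externally biased word prior.
# Everything here (corpus, similarity source, prior scaling) is assumed
# for illustration; the paper's actual mechanism is not described above.
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: documents as lists of word ids over a vocabulary of size V.
docs = [[0, 1, 2, 2], [2, 3, 4], [0, 0, 1, 4]]
V, K = 5, 2                      # vocabulary size, number of topics
alpha = 0.1                      # symmetric document-topic prior

# Hypothetical external knowledge: a V x V word-similarity matrix
# (e.g. distilled from WordNet or word embeddings).
similarity = np.eye(V) + 0.2 * rng.random((V, V))

# Bias the word-topic prior: each word's pseudo-count is boosted by its
# average similarity to the rest of the vocabulary.
biased_beta = 0.01 + 0.1 * similarity.mean(axis=1)   # shape (V,)

# Standard collapsed Gibbs sampling for LDA, with biased_beta in place
# of the usual symmetric beta.
z = [[rng.integers(K) for _ in d] for d in docs]      # topic assignments
ndk = np.zeros((len(docs), K))                        # doc-topic counts
nkw = np.zeros((K, V))                                # topic-word counts
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        ndk[d, z[d][i]] += 1
        nkw[z[d][i], w] += 1

for _ in range(200):                                  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] -= 1
            nkw[k, w] -= 1
            # Conditional for this word's topic, using the biased prior.
            p = (ndk[d] + alpha) * (nkw[:, w] + biased_beta[w]) \
                / (nkw.sum(axis=1) + biased_beta.sum())
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            ndk[d, k] += 1
            nkw[k, w] += 1

print("topic-word counts:\n", nkw)
```

Under this (assumed) design, the external knowledge acts only through the prior, so the sampler itself is unchanged; stronger similarity-derived pseudo-counts pull related words toward common topics even when the corpus evidence is thin.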