We propose a new unsupervised method for topic detection that automatically identifies the different facets of an event. We use pointwise Kullback-Leibler divergence along with the Jaccard coefficient to build a topic graph that represents the community structure of the different facets. The problem is formulated as a weighted set cover problem with dynamically varying weights. The algorithm is domain-independent and generates a representative set of informative and discriminative phrases that cover the entire event. We evaluate this algorithm on a large collection of blog postings about different news events and report promising results.
Pradeep Muthukrishnan, Joshua Gerrish, Dragomir R.