Multi-category classification of short dialogues is a common task performed by humans. When assigning a question to an expert, a customer service operator tries to classify the cu...
Concept location techniques are designed to help isolate sections of source code that relate to specific concepts. Blind Signal Separation techniques like Singular Value Decompos...
Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr&eg...
A key problem that arises when unstructured text is being queried is that of properly recognizing and exploiting geographical terms and entities. Here we describe a mechanism for ...
Yi Li, Alistair Moffat, Nicola Stokes, Lawrence Ca...
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them in...