Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
We investigate three methods for defining a session on Web search engines. We examine 2,465,145 interactions from 534,507 Web searchers. We compare defining sessions using: 1) Int...
Knowledge Discovery in time series usually requires symbolic time series. Many discretization methods that convert numeric time series to symbolic time series ignore the temporal ...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Surveillance systems have long been used to monitor industrial processes and are becoming increasingly popular in public health and anti-terrorism applications. Most early detecti...