The induction of knowledge from a data set relies in the execution of multiple data mining actions: to apply filters to clean and select the data, to train different algorithms (...
An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of i...
This paper presents a method for automatically generating an association thesaurus from a text corpus, and demonstrates its application to information retrieval. The thesaurus gen...
Identifying groups of Internet hosts with a similar behavior is very useful for many applications of Internet security control, such as DDoS defense, worm and virus detection, dete...
Anticipating the availability of large questionanswer datasets, we propose a principled, datadriven Instance-Based approach to Question Answering. Most question answering systems ...