Abstract. The problem of mining sequential patterns was recently introduced in 3 . We are given a database of sequences, where each sequence is a list of transactions ordered by tr...
An important theoretical tool in machine learning is the bias/variance decomposition of the generalization error. It was introduced for the mean square error in [3]. The bias/vari...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
The problem of analyzing microarray data became one of important topics in bioinformatics over the past several years, and different data mining techniques have been proposed for ...
Technology in the field of digital media generates huge amounts of non-textual information, audio, video, and images, along with more familiar textual information. The potential f...