Existing sequence mining algorithms mostly focus on mining for subsequences. However, a large class of applications, such as biological DNA and protein motif mining, require effici...
Classification is an important data mining problem. Given a training database of records, each tagged with a class label, the goal of classification is to build a concise model ...
Johannes Gehrke, Venkatesh Ganti, Raghu Ramakrishn...
Many organizations today have more than very large databases; they have databases that grow without limit at a rate of several million records per day. Mining these continuous dat...
The mining of frequent sequential patterns has been a hot and well studied area—under the broad umbrella of research known as KDD (Knowledge Discovery and Data Mining)— for we...
We study a new data mining problem concerning the discovery of frequent agreement subtrees (FASTs) from a set of phylogenetic trees. A phylogenetic tree, or phylogeny, is an unorde...