Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques f...
Millions of DNA sequences (reads) are generated by Next Generation Sequencing machines everyday. There is a need for high performance algorithms to map these sequences to the refer...
Clustering is ill-defined. Unlike supervised learning where labels lead to crisp performance criteria such as accuracy and squared error, clustering quality depends on how the cl...
Rich Caruana, Mohamed Farid Elhawary, Nam Nguyen, ...
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming ...
Hendrik Blockeel, Luc De Raedt, Nico Jacobs, Bart ...
Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic q...