Given a known protein sequence, predicting its secondary structure can help understand its three-dimensional (tertiary) structure, i.e., the folding. In this paper, we present an ...
Clustering is a pivotal building block in many data mining applications and in machine learning in general. Most clustering algorithms in the literature pertain to off-line (or...
Steven Young, Itamar Arel, Thomas P. Karnowski, De...
Entity Resolution (ER) is an important real-world problem that has attracted significant research interest over the past few years. It deals with determining which object descript...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
In earlier work we have introduced and explored a variety of different probabilistic models for the problem of answering selectivity queries posed to large sparse binary data set...
If the dataset available to machine learning results from cluster sampling (e.g. patients from a sample of hospital wards), the usual cross-validation error rate estimate can lead...