Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...
Background: Protein-protein interactions play a key role in biological processes of proteins within a cell. Recent high-throughput techniques have generated protein-protein intera...
Much work on skewed, stochastic, high dimensional, and biased datasets usually implicitly solve each problem separately. Recently, we have been approached by Texas Commission on En...
In this paper we study supervised and semi-supervised classification of e-mails. We consider two tasks: filing e-mails into folders and spam e-mail filtering. Firstly, in a sup...
Irena Koprinska, Josiah Poon, James Clark, Jason C...
Estimating distances in the Internet has been studied in the recent years due to its ability to improve the performance of many applications, e.g., in the peer-topeer realm. One sc...