Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...
A number of real-world domains such as social networks and e-commerce involve heterogeneous data that describes relations between multiple classes of entities. Understanding the n...
Background: An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed...
The performance of many supervised and unsupervised learning algorithms is very sensitive to the choice of an appropriate distance metric. Previous work in metric learning and ada...
Background: Expressed sequence tags (ESTs) are single pass reads from randomly selected cDNA clones. They provide a highly cost-effective method to access and identify expressed g...
John Parkinson, David B. Guiliano, Mark L. Blaxter