In this paper we present a scalable and distributed system for image retrieval based on visual features and annotated text. This system is the core of the SAPIR project. Its arc...
We have developed a set of methods and tools for automatic discovery of putative regulatory signals in genome sequences. The analysis pipeline consists of gene expression data clu...
Jaak Vilo, Alvis Brazma, Inge Jonassen, Alan J. Ro...
Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to genera...
Matthew G. Snover, Bonnie J. Dorr, Richard M. Schw...
We propose two hashing-based solutions to the problem of fast and effective spelling correction of personal names in People Search applications. The key idea behind our methods is to...
An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...