We make two main contributions in this paper. First, we motivate and introduce a novel class of data mining problems that arise in labeling a group of mass spectra, specifically f...
We study the fundamental problem of computing distances between nodes in large graphs such as the web graph and social networks. Our objective is to be able to answer distance que...
Atish Das Sarma, Sreenivas Gollapudi, Marc Najork,...
Temporal expressions, such as between 1992 and 2000, are frequent across many kinds of documents. Text retrieval, though, treats them as common terms, thus ignoring their inherent...
Irem Arikan, Srikanta J. Bedathur, Klaus Berberich
Abstract. Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...
This paper presents an interdisciplinary investigation of statistical information retrieval (IR) techniques for protein identification from tandem mass spectra, a challenging probl...