This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
The design, development, and use of complex systems models raise a unique class of challenges and potential pitfalls, many of which are commonly recurring problems. Over time, res...
Background: In proteomic analysis, MS/MS spectra acquired by a mass spectrometer are assigned to peptides by database searching algorithms such as SEQUEST. The assignations of pepti...
Large scale learning is often realistic only in a semi-supervised setting where a small set of labeled examples is available together with a large collection of unlabeled data. In...
Accurate topical categorization of user queries allows for increased effectiveness, efficiency, and revenue potential in general-purpose web search systems. Such categorization be...
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, ...