Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained in legacy sources and text collections into a data ...
Click data captures many users’ document preferences for a query and has been shown to help significantly improve search engine ranking. However, most click data is noisy and of...
In distributed environments, statements from a number of principals, besides the central trusted party, may influence the derivations of authorization decisions. However, existin...
We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
We describe a novel simple and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...