The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks. In the...
Researchers maintain bibliographies and extensive sets of PDF files of scholarly publications on their desktops. The lack of proper metadata for downloaded PDFs makes this task a t...
We investigate temporal resolution of documents, such as determining the date of publication of a story based on its text. We describe and evaluate a model that builds histograms e...
This paper presents Yagada, an algorithm to search labelled graphs for anomalies using both structural data and numeric attributes. Yagada is explained using several security-rela...
Michael Davis, Weiru Liu, Paul Miller, George Redp...
Various semi-supervised learning methods have been proposed recently to address the long-standing shortage of manually labeled data in sentiment classification. However, mos...