In this paper we consider the problem of building a system to predict the readability of natural-language documents. Our system is trained using diverse features based on syntax and l...
Rohit J. Kate, Xiaoqiang Luo, Siddharth Patwardhan...
Abstract. User-generated content in general, and blogs in particular, form an interesting and relatively little-explored domain for mining knowledge. We address the task of blog di...
Wouter Weerkamp, Krisztian Balog, Maarten de Rijke
Class syntax can be used to 1) model the temporal or locational evolution of class labels of feature observation sequences, 2) correct classification errors of static classifiers if ...
This paper proposes a new approach for classifying text documents into two disjoint classes. The new approach is based on extracting patterns, in the form of two logical expressio...
Documents and authors can be clustered into “knowledge communities” based on the overlap in the papers they cite. We introduce a new clustering algorithm, Streemer, which fin...
Vasileios Kandylas, S. Phineas Upham, Lyle H. Unga...