Finding good representations of text documents is crucial in information retrieval and classification systems. Today the most popular document representation is based on a vector ...
The amount of information available in the MEDLINE database makes it very hard for a researcher to retrieve a reasonable amount of relevant documents using a simple query language ...
In recent years, active learning methods based on experimental design achieve state-of-the-art performance in text classification applications. Although these methods can exploit ...
Web pages are more than text and they contain much contextual and structural information, e.g., the title, the meta data, the anchor text, etc., each of which can be seen as a dat...
Text classification remains one of the major fields of research in natural language processing. This paper evaluates the use of the computational tool Coh-Metrix as a means to dis...
Scott A. Crossley, Philip M. McCarthy, Danielle S....