Creating and editing source code are tedious and error-prone processes. One important source of errors in editing programs is the failure to correctly adapt a block of copied code...
Category ranking provides a way to classify plain text documents into a pre-determined set of categories. This work proposes to have a look at typical document collections and ana...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...
Background: The rapid growth of biomedical literature presents challenges for automatic text processing, and one of the challenges is abbreviation identification. The presence of ...
Sunghwan Sohn, Donald C. Comeau, Won Kim, W. John ...
Currently an abundance of historical manuscripts, journals, and scientific notes remain largely unaccessible in library archives. Manual transcription and publication of such docu...