Social tagging is an increasingly popular way to describe and classify documents on the web. However, the quality of the tags varies considerably since the tags are authored freel...
Dynamic sentiment ambiguous adjectives (DSAAs) like "large, small, high, low"pose a challenging task on sentiment analysis. This paper proposes a knowledge-based method ...
Co-training is a semi-supervised technique that allows classifiers to learn with fewer labelled documents by taking advantage of the more abundant unclassified documents. However, ...
Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as docu...
This paper describes the development of algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. In this paper we describe ...
Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sah...