Text documents, in electronic and hardcopy forms, are and will probably remain the most widely used kind of content in our digital age. The goal of this paper is to overview proto...
To date many methods and programs for automatic text recognition exist. However there are no effective text recognition systems for graphic documents. Graphic documents usually con...
Alexander F. Gelbukh, Serguei Levachkine, Sang-Yon...
This paper presents a method of automatically creating hypermedia documents from conventional transcriptions of television programs. Using parallel text alignment techniques, the ...
—Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search....
Donald Metzler, Jasmine Novak, Hang Cui, Srihari R...