Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...
This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a specific ...
—Most Web and legacy paper-based documents are available in human comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an ...
Given a large collection of medical images of several conditions and treatments, how can we succinctly describe the characteristics of each setting? For example, given a large col...
This paper presents our work on automatically locating charts from document pages, which is an important stage in the chart image recognition and understanding system being develo...