Automatic hypertext classification is an essential technique for organizing vast amount of Internet Web pages or HTML documents. One the of problems in classifying Web pages is tha...
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...
Text classification categories Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...
We have developed an automatic DTB production platform [3], which is capable of flexibly generating different user interfaces for talking books. DTBs are built from digital copies ...
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...