We present the Private Digital Library (PDL) project that represents a service of the Corporate Digital Library (CDL) prototype. The main ideas underlying this project are the foll...
Giovanni Semeraro, Fabio Abbattista, Nicola Fanizz...
The Web is huge, unstructured and diverse in quality, which makes searching for information difficult. In practice, few of the documents returned by a search engine are valuable ...
Despite the increase in email and other forms of digital communication, the use of printed documents continues to increase every year. Many types of printed documents need to be ...
Aravind K. Mikkilineni, Gazi N. Ali, Pei-Ju Chiang...
In traditional text classification, a classifier is built using labeled training documents of every class. This paper studies a different problem. Given a set P of documents of a ...
With the proliferation of heterogeneous devices (desktop computers, personal digital assistants, phones), multimedia documents must be played under various constraints (small scre...
The paper starts with a short overview of areas of application for user profiles. Subsequently, a method to represent a user profile in the field of document retrieval by using que...
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform classif...
Many old manuscript documents were written on both sides of the paper, and the bleed-through from one side of the document to the other increases the difficulty in reading or deci...
This paper describes a new method for the classification of an HTML document into a hierarchy of categories. The hierarchy of categories is involved in all phases of automated docum...
Text categorization involves mapping of documents to a fixed set of labels. A similar but equally important problem is that of assigning labels to large corpora. With a deluge of ...