The increasing amount of available textual information makes the use of Natural Language Processing (NLP) tools necessary. These tools have to be used on large collections of docu...
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...
Retrieving data based on more than keywords alone is a challenge. We worked on semi-structured data (cultural heritage corpora). Our project aimed at getting the most relevant text-unit...
Julien Lesbegueries, Christian Sallaberry, Mauro G...
Digital Libraries will hold huge amounts of text and other forms of information. For the collections to be maximally useful, they must be highly organized with useful indexes and ...
Robert P. Futrelle, Xiaolan Zhang 0002, Yumiko Sek...