Text representation is a central task for any approach to automatic learning from texts. It requires a format that allows texts to be interrelated even if they do not share content w...
Complex documents stored in a flat or partially marked-up file format require layout-sensitive preprocessing before any natural language processing can be carried out on their tex...
The paper reports on completed work aimed at creating a resource, namely the Greek Textual Entailment Corpus (GTEC), that is appropriate for guiding training and evaluation...
Evi Marzelou, Maria Zourari, Voula Giouli, Stelios...
Metadata++ is a digital library system that we are developing to serve the needs of the United States Department of Agriculture Forest Service, the United States Department of th...
Mathew Weaver, Lois M. L. Delcambre, Timothy Tolle
Some discourse structures, such as enumerative structures, have typographical, punctuation, and layout characteristics which (1) make them easily identifiable and (2) convey hi...