Abstract. In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems foun...
Jan Strunk, Carlos Nascimento Silla Jr., Celso A. ...
: TextWise LLC. participated in the TREC-7 Cross-Language Retrieval track using the CINDOR system, which utilizes a "conceptual interlingua" representation of documents a...
Anne Diekema, Farhad Oroumchian, Paraic Sheridan, ...
Taking advantage of the well-known cluster hypothesis that “closely associated documents tend to be relevant to the same request”, we can use inter-document similarity to prov...
We discuss four different core protocols for synchronizing access to and modifications of XML document collections. These core protocols synchronize structure traversals and modif...
Sven Helmer, Carl-Christian Kanne, Guido Moerkotte
This paper presents a new representation and evaluation procedure of page segmentation algorithms and analyzes six widely-used layout analysis algorithms using the procedure. The ...