Despite the widespread use of BM25, there have been few studies examining its effectiveness on a document description over single and multiple field combinations. We determine t...
Tables are a commonly used presentation scheme, especially for describing relational information. However, table understanding remains an open problem. In this paper, we consider th...
In this paper we present an integrated approach for semantic structure extraction in document images. Document images are initially processed to extract both their layout and logic...
Efficiently finding Web services on the Web is a challenging issue in service-oriented computing. Currently, UDDI is a standard for publishing and discovering Web services, and U...
Identifying the most influential documents in a corpus is an important problem in many fields, from information science and historiography to text summarization and news aggregati...