In this report we describe the approach of the University of Twente to the 2006 GeoCLEF task. It is based on retrieval by content followed by filtering by geographical rele...
Advances in digital technology for the graphic and textual representation of manuscripts have not, until recently, been applied to the world's oldest manuscripts, cuneiform table...
Jonathan D. Cohen, Donald Duncan, Dean Snyder, Jer...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
The cultural heritage domain dealing with digital surrogates of rare and fragile historic artifacts is one of the most promising areas for establishing collaboratories, i.e. shared...
We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features. We also devise a principled procedure for learning hierarchies of invari...