Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)?(c). Howev...
Recently answers for fact lookup queries have appeared on major search engines. For example, for the query {Barack Obama date of birth} Google directly shows “4 August 1961” a...
This paper describes the lifecycle of a digital historical document, from template-based structure definition through to content extraction from the scanned pages and its final re...
Many real–world information needs are naturally formulated as queries with temporal constraints. However, the structured temporal background information needed to support such c...
Steven Schockaert, Martine De Cock, Etienne E. Ker...
As originally conceived, the World Wide Web was intended for the purpose of sharing information. Many websites realise this aim by publishing pages from a data repository which su...