We describe the WebCLEF 2008 task. As in the 2007 edition of WebCLEF, the 2008 edition implements a multilingual "information synthesis" task, where, for a given t...
In this paper we describe preliminary work that examines whether statistical properties of the structure of websites can be an informative measure of their quality. We aim to deve...
Vaclav Petricek, Tobias Escher, Ingemar J. Cox, He...
This paper investigates the role of ontologies as a central part of an architecture to repurpose existing material from the web. A prototype system called ArtEquAKT is presented, ...
Mark J. Weal, Harith Alani, Sanghee Kim, Paul H. L...
Governments often hold very rich data, and whilst much of this information is published and available for re-use by others, it is often trapped by poor data structures, lo...
Harith Alani, David Dupplaw, John Sheridan, Kieron...
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...