EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
In this short note we present a recommendation system for automatic retrieval of broken Web links using an approach based on contextual information. We extract information from th...
Abstract. This paper describes the new e-learning tool CHESt that allows students to search in a knowledge base for short (teaching) multimedia clips by using a semantic search eng...
The web hasgreatly improved accessto scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spreadacrossarchive site...