In this paper we will describe Berkeley's approach to the Domain Specific (DS) track for CLEF 2008. Last year we used Entry Vocabulary Indexes and Thesaurus expansion approac...
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...
Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...
This paper presents an intelligent Internet information system, Automatic Classifier for the Internet Resource Discovery (ACIRD), which uses machine learning techniques to organiz...
It is now accepted that the most effective video shot retrieval is based on indexing and retrieving clips using multiple, parallel modalities such as text-matching, image-matching a...
The deep web contains an order of magnitude more information than the surface web, but that information is hidden behind the web forms of a large number of web sites. Metasearch e...
Jeffrey P. Bigham, Ryan S. Kaminsky, Jeffrey Nicho...