Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users' queries. The majority of these documents are genera...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
The empirical investigation of the effectiveness of information retrieval (IR) systems requires a test collection, a set of query topics, and a set of relevance judgments made by ...
Feature selection for unsupervised tasks is particularly challenging, especially when dealing with text data. The increase in online documents and email communication creates a nee...
Nirmalie Wiratunga, Robert Lothian, Stewart Massie
Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
We address the problem of answering broad-topic queries on the World Wide Web. We present a link based analysis algorithm SelHITS, which is an improvement over Kleinberg's HI...