The non-English Web is growing at breakneck speed, but available language processing tools are mostly English-based. Taxonomies are a case in point: while there are plenty of comm...
Xuerui Wang, Andrei Z. Broder, Evgeniy Gabrilovich...
A website can regulate search engine crawler access to its content using the robots exclusion protocol, specified in its robots.txt file. The rules in the protocol enable the site...
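The robots exclusion protocol described above can be exercised directly with Python's standard-library parser. The sketch below is illustrative only: the rule file and the paths are hypothetical, not taken from the paper, and it simply shows how a crawler would check a URL against a site's rules before fetching it.

```python
from urllib import robotparser

# Hypothetical robots.txt content for illustration (not from the paper).
# Python's parser applies rules in order of appearance, so the Allow
# line is listed before the blanket Disallow.
ROBOTS_TXT = """\
User-agent: *
Allow: /public/
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A compliant crawler consults the rules before requesting each path.
print(rp.can_fetch("*", "/public/page.html"))  # allowed by the Allow rule
print(rp.can_fetch("*", "/secret.html"))       # blocked by Disallow: /
```

In practice a crawler would load the file from `http://<host>/robots.txt` via `rp.set_url(...)` and `rp.read()` instead of parsing an inline string.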
The term weighting scheme, which converts documents into vectors in the term space, is a vital step in automatic text categorization. In this paper, we conducted com...
Man Lan, Chew Lim Tan, Hwee-Boon Low, Sam Yuan Sun...
Search engines return ranked lists of Web pages in response to queries. These pages are starting points for post-query navigation, but may be insufficient for search tasks involvi...
Experimental performance studies on computer systems, including Grids, require a deep understanding of their workload characteristics. The need arises from two important and closel...