Search engines are the primary gateways of information access on the Web today. Behind the scenes, search engines crawl the Web to populate a local indexed repository of Web pages...
Query-independent features (also called document priors), such as the number of incoming links to a document, its Page-Rank, or the type of its associated URL, have been successfu...
Background: Sequence comparison is one of the most prominent tools in biological research, and is instrumental in studying gene function and evolution. The rapid development of hi...
Tomer Shlomi, Daniel Segal, Eytan Ruppin, Roded Sh...
In this paper, we study the problem of Web forum crawling. Web forum has now become an important data source of many Web applications; while forum crawling is still a challenging ...
Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...
The monitoring of packets destined for reachable, yet unused, Internet addresses has proven to be a useful technique for measuring a variety of specific Internet phenomenon (e.g.,...
Eric Wustrow, Manish Karir, Michael Bailey, Farnam...