The computation of page importance in a huge dynamic graph has recently attracted a lot of attention because of the web. Page importance, or page rank is defined as the fixpoint o...
An increasing number of applications operate on data obtained from the Web. These applications typically maintain local copies of the web data to avoid network latency in data acc...
This paper focuses on a theory of tense and aspect (the representation of time in natural language) that attempts a formM representation of the relevant liltguistic devices as imp...
As part of the Language Observatory Project [4], we have been crawling all the web space since 2004. We have collected terabytes of data mostly from Asian and African ccTLDs. In t...
Rizza Camus Caminero, Pavol Zavarsky, Yoshiki Mika...
We present the design of Dynabot, a guided Deep Web discovery system. Dynabot's modular architecture supports focused crawling of the Deep Web with an emphasis on matching, p...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...