Never before have so many information sources been available. Most are accessible online, and some exist only on the Internet. However, this sheer quantity of information makes inte...
Web search logs contain extremely sensitive data, as evidenced by the recent AOL incident. However, storing and analyzing search logs can be very useful for many purposes (e.g. in...
As the popularity of the World Wide Web increases, the resulting traffic causes major congestion when retrieving data over long distances. To react to this, user...
This paper describes an approach to providing lexical information for natural language processing in unrestricted domains. A system of approximately 1200 morphological rules is us...
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...