Sciweavers

» Random web crawls
WWW
2006
ACM
14 years 10 months ago
Detecting nepotistic links by language model disagreement
In this short note we demonstrate the applicability of hyperlink downweighting by means of language model disagreement. The method filters out hyperlinks with no relevance to the ...
András A. Benczúr, István B&i...
WWW
2003
ACM
14 years 10 months ago
Dynamic maintenance of web indexes using landmarks
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
WWW
2003
ACM
14 years 10 months ago
Monitoring the dynamic web to respond to continuous queries
Continuous queries are queries for which responses given to users must be continuously updated, as the sources of interest get updated. Such queries occur, for instance, during on...
Sandeep Pandey, Krithi Ramamritham, Soumen Chakrab...
WWW
2007
ACM
14 years 10 months ago
Towards Deeper Understanding of the Search Interfaces of the Deep Web
Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that allow users to specify complex and precise queries to access the underlying ...
Hai He, Weiyi Meng, Yiyao Lu, Clement T. Yu, Zongh...
SIGMOD
2000
ACM
14 years 2 months ago
Finding Replicated Web Collections
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...