Sciweavers

Web Crawling
WWW
2006
ACM
14 years 10 months ago
Status of the African Web
As part of the Language Observatory Project [4], we have been crawling all the web space since 2004. We have collected terabytes of data mostly from Asian and African ccTLDs. In t...
Rizza Camus Caminero, Pavol Zavarsky, Yoshiki Mika...
ASWC
2006
Springer
14 years 1 month ago
Next Generation Semantic Web Applications
Watson is a gateway to the Semantic Web: it collects, analyzes and gives access to ontologies and semantic data available online. Its objective is to support the development of ne...
Enrico Motta, Marta Sabou
NIPS
2000
13 years 11 months ago
The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity
We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
David A. Cohn, Thomas Hofmann
ICWE
2005
Springer
14 years 3 months ago
Identifying Websites with Flow Simulation
We present in this paper a method to discover the set of webpages contained in a logical website, based on the link structure of the Web graph. Such a method is useful in the conte...
Pierre Senellart
JWSR
2007
13 years 9 months ago
Service Class Driven Dynamic Data Source Discovery with DynaBot
Dynamic Web data sources – sometimes known collectively as the Deep Web – increase the utility of the Web by providing intuitive access to data repositories anywhere that Web...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...