Sciweavers

563 search results - page 9 / 113
» Crawling the web for structured documents
PVLDB
2008
13 years 9 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
CIKM
2004
Springer
14 years 3 months ago
Node ranking in labeled directed graphs
Our work is motivated by the problem of ranking hyperlinked documents for a given query. Given an arbitrary directed graph with edge and node labels, we present a new flow-based ...
Krishna Prasad Chitrapura, Srinivas R. Kashyap
ICSOC
2009
Springer
14 years 4 months ago
Web Service Search on Large Scale
The Web is nowadays moving from a Web of data to a Web of services. In this paper we present our approach for Web Service discovery on Web scale, targeted to support flexible and ...
Nathalie Steinmetz, Holger Lausen, Manuel Brunner
ACSW
2004
13 years 11 months ago
Discovering Parallel Text from the World Wide Web
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and mul...
Jisong Chen, Rowena Chau, Chung-Hsing Yeh
IDEAL
2004
Springer
14 years 3 months ago
An Intelligent Topic-Specific Crawler Using Degree of Relevance
It is essential that users surfing the Internet have web pages classified into a given topic as correctly as possible. Toward this end, this paper presents a topic-...
Sanguk Noh, Youngsoo Choi, Haesung Seo, Kyunghee C...