Sciweavers

416 search results - page 21 / 84
» Structured Web Pages Management for Efficient Data Retrieval
WWW
2009
ACM
14 years 8 months ago
Graph based crawler seed selection
This paper identifies and explores the problem of seed selection in a web-scale crawler. We argue that seed selection is not a trivial but very important problem. Selecting proper...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
WWW
2006
ACM
14 years 8 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
WWW
2008
ACM
14 years 8 months ago
User oriented link function classification
Currently most link-related applications treat all links in the same web page to be identical. One link-related application usually requires one certain property of hyperlinks but...
Mingliang Zhu, Weiming Hu, Ou Wu, Xi Li, Xiaoqin Z...
DASFAA
2010
Springer
13 years 12 months ago
FlexTable: Using a Dynamic Relation Model to Store RDF Data
Efficient management of RDF data is an important factor in realizing the Semantic Web vision. The existing approaches store RDF data based on triples instead of a relation model. I...
Yan Wang, Xiaoyong Du, Jiaheng Lu, Xiaofang Wang
CIKM
2008
ACM
13 years 9 months ago
Efficient and effective link analysis with precomputed SALSA maps
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...
Marc Najork, Nick Craswell