Sciweavers

» Crawling the web for structured documents
KDD
2007
ACM
Mining templates from search result records of search engines
Metasearch engines, comparison-shopping systems, and Deep Web crawling applications need to extract the search result records embedded in result pages returned from search engines in response ...
Hongkun Zhao, Weiyi Meng, Clement T. Yu
ICDM
2006
IEEE
Unsupervised Learning of Tree Alignment Models for Information Extraction
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table, a data structure that better lends itself to high-level...
Philip Zigoris, Damian Eads, Yi Zhang
ASWEC
2004
IEEE
UML Documentation Support for XML Schema
With the proliferation of XML as the lingua franca of internet information exchange, engineering XML documents and maintaining their databases becomes a major challenge. In this c...
Flora Dilys Salim, Rosanne Price, Shonali Krishnas...
WWW
2005
ACM
Three-level caching for efficient query processing in large Web search engines
Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple...
Xiaohui Long, Torsten Suel
ECAI
2004
Springer
OntoRefiner, a user query refinement interface usable for Semantic Web Portals
We present a user interface, the OntoRefiner system, for helping the user to navigate numerous retrieved documents after a search querying a semantic portal which integrates a ver...
Brigitte Safar, Hassen Kefi, Chantal Reynaud