Sciweavers

Extracting Sequences from the Web
SIGMOD
2003
ACM
14 years 2 months ago
Extracting Structured Data from Web Pages
Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its b...
Arvind Arasu, Hector Garcia-Molina
RIAO
1997
13 years 10 months ago
Coupling information retrieval and information extraction: A new text technology for gathering information from the web
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
Robert J. Gaizauskas, Alexander M. Robertson
WWW
2006
ACM
14 years 9 months ago
POLYPHONET: an advanced social network extraction system from the web
Social networks play important roles in the Semantic Web: knowledge management, information retrieval, ubiquitous computing, and so on. We propose a social network extraction syst...
Hideaki Takeda, Junichiro Mori, Kôiti Hasida...
SIGIR
2005
ACM
14 years 2 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
LREC
2008
13 years 10 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany