Sciweavers

Extracting Sequences from the Web
DEXAW
2008
IEEE
14 years 3 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu
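
The abstract above is truncated, so the following is only a minimal sketch of the idea it names, not the authors' implementation. It assumes the ratio is computed per line of HTML as the count of non-tag characters divided by the count of tags, with a denominator of 1 for lines that contain no tags. The function names `text_to_tag_ratio` and `ratios_per_line` are hypothetical.

```python
import re

# Matches anything that looks like an HTML tag, e.g. <p>, </a>, <a href="/">.
TAG_RE = re.compile(r"<[^>]*>")


def text_to_tag_ratio(line: str) -> float:
    """Return non-tag characters divided by tag count for one line of HTML."""
    tag_count = len(TAG_RE.findall(line))
    text_length = len(TAG_RE.sub("", line).strip())
    # Assumption: a line without tags is divided by 1 to avoid division by zero.
    return text_length / max(tag_count, 1)


def ratios_per_line(html: str) -> list[float]:
    """Compute the ratio for every line of an HTML document."""
    return [text_to_tag_ratio(line) for line in html.splitlines()]


if __name__ == "__main__":
    sample = (
        '<div><a href="/">Home</a><a href="/x">X</a></div>\n'
        "<p>Long paragraph of text here with words.</p>"
    )
    for ratio, line in zip(ratios_per_line(sample), sample.splitlines()):
        print(f"{ratio:6.2f}  {line}")
```

Under these assumptions, link-heavy navigation lines score low and prose-heavy lines score high, which is the kind of signal a content extractor could threshold or cluster on.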
ISMIS
2003
Springer
14 years 2 months ago
MetaNews: An Information Agent for Gathering News Articles on the Web
This paper presents MetaNews, an information gathering agent for news articles on the Web. MetaNews reads HTML documents from online news sites and extracts article information fro...
Dae-Ki Kang, Joongmin Choi
KES
2008
Springer
13 years 9 months ago
Data Mining for Navigation Generating System with Unorganized Web Resources
Users prefer to navigate subjects from organized topics in an abundance of resources rather than to list pages retrieved from search engines. We propose a framework to cluster frequent items...
Diana Purwitasari, Yasuhisa Okazaki, Kenzi Watanab...
ICDE
2004
IEEE
14 years 10 months ago
Probe, Cluster, and Discover: Focused Extraction of QA-Pagelets from the Deep Web
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...
James Caverlee, Ling Liu, David Buttler
ER
2006
Springer
14 years 19 days ago
Automatic Creation of Web Services from Extraction Ontologies
The Semantic Web promises to provide timely, targeted access to user-specified information online. Though standardized services exist for performing this work, specifying...
Cui Tao, Yihong Ding, Deryle W. Lonsdale