Sciweavers

» Extracting semantic structure of web documents using content...
WWW
2001
ACM
14 years 9 months ago
Towards second and third generation web-based multimedia
First generation Web-content encodes information in handwritten (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with c...
Jacco van Ossenbruggen, Joost Geurts, Frank Cornel...
ADBIS
2003
Springer
14 years 1 month ago
Using Common Schemas for Information Extraction from Heterogeneous Web Catalogs
The Web has become the world’s largest information source. Unfortunately, the main success factor of the Web, the inherent principle of distribution and autonomy of the participa...
Richard Vlach, Wassili Kazakos
WISE
2005
Springer
14 years 2 months ago
Semantic Partitioning of Web Pages
In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchi...
Srinivas Vadrevu, Fatih Gelgi, Hasan Davulcu
ICUIMC
2009
ACM
14 years 3 months ago
PicAChoo: a tool for customizable feature extraction utilizing characteristics of textual data
Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for intelligent services. For this reason, feature extraction ...
Jaeseok Myung, Jung-Yeon Yang, Sang-goo Lee
JCDL
2006
ACM
14 years 2 months ago
Automatic extraction of table metadata from digital documents
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and high...
Ying Liu, Prasenjit Mitra, C. Lee Giles, Kun Bai