Sciweavers

» Identifying Content Blocks from Web Documents
SIGDOC
2004
ACM
14 years 2 months ago
Semantic thumbnails: a novel method for summarizing document collections
The concept of thumbnails is common in image representation. A thumbnail is a highly compressed version of an image that provides a small, yet complete visual representation to th...
Arijit Sengupta, Mehmet M. Dalkilic, James C. Cost...
CDVE
2006
Springer
14 years 10 days ago
Flexible Collaboration over XML Documents
XML documents are increasingly being used to mark up various kinds of data, from web content to scientific data. Often these documents need to be collaboratively created a...
Claudia-Lavinia Ignat, Moira C. Norrie
CIKM
2003
Springer
14 years 1 month ago
Extracting unstructured data from template generated web documents
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Ling Ma, Nazli Goharian, Abdur Chowdhury, Misun Ch...
AUSAI
2003
Springer
14 years 1 month ago
Semi-Automatic Construction of Metadata from a Series of Web Documents
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara
AUSDM
2006
Springer
14 years 10 days ago
Extraction of Flat and Nested Data Records from Web Pages
This paper studies the problem of identifying and extracting flat and nested data records from a given web page. With the explosive growth of information sources ...
Siddu P. Algur, P. S. Hiremath