Sciweavers

» Extracting semantic structure of web documents using content...
WWW
2008
ACM
14 years 9 months ago
Visualizing historical content of web pages
Recently, along with the rapid growth of the Web, the preservation efforts have also increased. As a consequence, large amounts of past Web data are stored in Web archives. This h...
Adam Jatowt, Yukiko Kawai, Katsumi Tanaka
ICDAR
2005
IEEE
14 years 2 months ago
Enhancement of Layout-based Identification of Low-resolution Documents using Geometrical Color Distribution
This paper proposes a multi-signature document identification method that works robustly with low-resolution documents captured from handheld devices. The proposed method is based ...
Ardhendu Behera, Denis Lalanne, Rolf Ingold
ICEIS
2009
IEEE
14 years 3 months ago
Semi-supervised Information Extraction from Variable-length Web-page Lists
We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance - varia...
Daniel Nikovski, Alan Esenther, Akihiro Baba
ELPUB
2007
ACM
14 years 19 days ago
Digitisation and Access to Archival Collections: A Case Study of the Sofia Municipal Government (1878-1879)
The paper presents in brief a project aimed at the development of a methodology and corresponding software tools intended for building of proper environments giving up means for s...
Maria Nisheva-Pavlova, Pavel Pavlov, Nikolay Marko...
AUSAI
2003
Springer
14 years 1 months ago
Semi-Automatic Construction of Metadata from a Series of Web Documents
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara