Sciweavers

2677 search results - page 49 / 536
» Extracting Structured Data from Web Pages
ICDAR
2003
IEEE
14 years 1 month ago
Automated Detection and Segmentation of Table of Contents Page from Document Images
With an aim to extract the structural information from the table of contents (TOC) to help develop a digital document library, the requirement of identifying/segmenting the TOC page ...
S. Mandal, S. P. Chowdhury, Amit Kumar Das, Bhabat...
AC
2006
Springer
13 years 8 months ago
Web Testing for Reliability Improvement
In this chapter, we characterize problems for web applications, examine existing testing techniques that are potentially applicable to the web environment, and introduce a strateg...
Jeff Tian, Li Ma
TKDE
2002
13 years 8 months ago
Query Relaxation by Structure and Semantics for Retrieval of Logical Web Documents
Since WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...
Wen-Syan Li, K. Selçuk Candan, Quoc Vu, Div...
DMKD
2003
ACM
14 years 1 month ago
Deriving link-context from HTML tag tree
HTML anchors are often surrounded by text that seems to describe the destination page appropriately. The text surrounding a link or the link-context is used for a variety of tasks...
Gautam Pant
ACL
2007
13 years 10 months ago
PageRanking WordNet Synsets: An Application to Opinion Mining
This paper presents an application of PageRank, a random-walk model originally devised for ranking Web search results, to ranking WordNet synsets in terms of how strongly they pos...
Andrea Esuli, Fabrizio Sebastiani