Sciweavers

502 search results - page 5 / 101
» Extracting Partial Structures from HTML Documents
INFORMATICALT
2007
13 years 8 months ago
Extracting Personalised Ontology from Data-Intensive Web Application: an HTML Forms-Based Reverse Engineering Approach
The advance of the Web has significantly and rapidly changed the way of information organization, sharing and distribution. The next generation of the web, the semantic web, seeks...
Sidi Mohamed Benslimane, Mimoun Malki, Mustapha Ka...
WWW
2005
ACM
14 years 9 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
PVLDB
2010
13 years 6 months ago
SXPath - Extending XPath towards Spatial Querying on Web Documents
Querying data from presentation formats like HTML, for purposes such as information extraction, requires the consideration of tree structures as well as the consideration of spati...
Ermelinda Oro, Massimo Ruffolo, Steffen Staab
WEBI
2005
Springer
14 years 1 month ago
Automated Metadata and Instance Extraction from News Web Sites
In this paper, we present automated techniques for extracting metadata instance information by organizing and mining a set of news Web sites. We develop algorithms that detect and...
Srinivas Vadrevu, Saravanakumar Nagarajan, Fatih G...
ICTAI
1999
IEEE
14 years 21 days ago
A New Study on Using HTML Structures to Improve Retrieval
Locating useful information effectively from the World Wide Web (WWW) is of wide interest. This paper presents new results on a methodology of using the structures and hyperlinks ...
Michal Cutler, H. Deng, S. Maniccam, Weiyi Meng