Sciweavers

502 search results - page 7 / 101
» Extracting Partial Structures from HTML Documents
WEBNET
1998
13 years 9 months ago
Categorisation by Context
Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organise documents into hi...
Giuseppe Attardi, Sergio Di Marco, Davide Salvi
DEXAW
2008
IEEE
14 years 2 months ago
Text Extraction from the Web via Text-to-Tag Ratio
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
Tim Weninger, William H. Hsu
COOPIS
1999
IEEE
14 years 22 days ago
Looking at the Web through XML Glasses
The Web so far has been incredibly successful at delivering information to human users. So successful actually, that there is now an urgent need to go beyond a browsing human and ...
Arnaud Sahuguet, Fabien Azavant
ICML
2002
IEEE
14 years 9 months ago
Kernels for Semi-Structured Data
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...
Hisashi Kashima, Teruo Koyanagi
FLAIRS
2007
13 years 10 months ago
Contextual Concept Discovery Algorithm
In this paper, we focus on the ontological concept extraction and evaluation process from HTML documents. In order to improve this process, we propose an unsupervised hierarchical...
Lobna Karoui, Marie-Aude Aufaure, Nacéra Be...