Sciweavers

563 search results - page 43 / 113
» Crawling the web for structured documents
WWW
2006
ACM
16 years 4 months ago
Relaxed: on the way towards true validation of compound documents
To maintain interoperability in the Web environment it is necessary to comply with Web standards. Current specifications of HTML and XHTML languages define conformance conditions ...
Jirka Kosek, Petr Nálevka
COOPIS
1999
IEEE
15 years 8 months ago
Looking at the Web through XML Glasses
The Web so far has been incredibly successful at delivering information to human users. So successful actually, that there is now an urgent need to go beyond a browsing human and ...
Arnaud Sahuguet, Fabien Azavant
WWW
2007
ACM
16 years 4 months ago
A new suffix tree similarity measure for document clustering
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Hung Chim, Xiaotie Deng
ICDE
2010
IEEE
16 years 3 months ago
Viewing a World of Annotations through AnnoVIP
The proliferation of electronic content has notably led to the emergence of large corpora of interrelated structured documents (such as HTML and XML Web pages) and semantic annot...
Konstantinos Karanasos, Spyros Zoupanos
EDBT
2006
ACM
16 years 4 months ago
STRIDER: A Versatile System for Structural Disambiguation
We present STRIDER, a versatile system for the disambiguation of structure-based information like XML schemas, structures of XML documents and web directories. The system perform...
Federica Mandreoli, Riccardo Martoglia, Enrico Ron...