Sciweavers

DL
2000
Springer

Re-engineering structures from Web documents

To realise a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and this requirement can be facilitated by the use of the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structures, i.e., the DTD, from a collection of similarly structured XML documents when they share some common but unknown DTD. The essential data structures and algorithms for DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we also propose a method of imposing a constraint on the repetitiveness of the elements in a DTD rule to further simplify the generated DTD without compromising its correctness.
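As a rough illustration of what deriving a DTD from similarly structured XML documents can look like, the sketch below infers one element rule per tag from a small collection. It is not the framework or the algorithms from the paper: the function name `infer_dtd`, the treatment of childless elements as `#PCDATA`, and the fallback to a choice group when child order is inconsistent are all assumptions made for this example. Collapsing repeated children into `+` or `*` is only a loose analogue of the repetitiveness constraint mentioned in the abstract. Mixed content, attributes, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_dtd(documents):
    """Infer one <!ELEMENT ...> rule per tag from a list of XML strings.

    Hypothetical helper for illustration only; not the paper's method.
    """
    # tag -> list of child-tag sequences, one per occurrence of that tag
    observed = defaultdict(list)
    for text in documents:
        root = ET.fromstring(text)
        for elem in root.iter():
            observed[elem.tag].append([child.tag for child in elem])

    rules = {}
    for tag, sequences in observed.items():
        # Child tags in first-seen order across all occurrences.
        order = []
        for seq in sequences:
            for child in seq:
                if child not in order:
                    order.append(child)

        if not order:
            # Assumption: an element never seen with children holds text.
            rules[tag] = f"<!ELEMENT {tag} (#PCDATA)>"
            continue

        # Check that every occurrence respects the first-seen order.
        position = {child: i for i, child in enumerate(order)}
        ordered = all(
            position[a] <= position[b]
            for seq in sequences
            for a, b in zip(seq, seq[1:])
        )
        if not ordered:
            # Conservative fallback: any mix of the observed children.
            rules[tag] = f"<!ELEMENT {tag} ({' | '.join(order)})*>"
            continue

        parts = []
        for child in order:
            counts = [seq.count(child) for seq in sequences]
            optional = min(counts) == 0
            repeated = max(counts) > 1
            if optional and repeated:
                suffix = "*"
            elif optional:
                suffix = "?"
            elif repeated:
                suffix = "+"
            else:
                suffix = ""
            parts.append(child + suffix)
        rules[tag] = f"<!ELEMENT {tag} ({', '.join(parts)})>"
    return rules


if __name__ == "__main__":
    docs = [
        "<book><title>A</title><author>X</author><author>Y</author></book>",
        "<book><title>B</title><author>Z</author><year>2000</year></book>",
    ]
    for rule in infer_dtd(docs).values():
        print(rule)
```

For the two sample `book` documents, the inferred rule for `book` is `<!ELEMENT book (title, author+, year?)>`: `title` always appears once, `author` repeats, and `year` is sometimes absent.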
Chuang-Hue Moh, Ee-Peng Lim, Wee Keong Ng
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 2000
Where DL