XML (eXtensible Markup Language) is a linear syntax for trees that has attracted a remarkable amount of interest in industry. The acceptance of XML opens new avenues for the application of methods such as the specification of abstract syntax tree sets and tree transformations. A notation for defining a set of XML trees is called a schema language. Such trees correspond to a specific user domain, such as XHTML, the class of XML documents that make sense as HTML. A useful schema notation must: identify most of the syntactic requirements that the documents in the user domain follow; allow efficient parsing; be readable to the user; allow limited tree transformations corresponding to the insertion of defaults; and be modular and extensible to support evolving classes of XML documents. In the present paper, we introduce the DSD (Document Structure Description) notation as our proposal for meeting the requirements above. The expressiveness of DSDs goes far beyond the DTD concept that is already bu...
Nils Klarlund, Anders Møller, Michael I. Sc