This paper proposes a clustering approach that explores both the content and the structure of XML documents for determining similarity among them. Assuming that the content and th...
Semantic web researchers tend to assume that XML Schema and OWL-S are the correct means for representing the types, structure, and semantics of XML data used for documents and int...
Andruid Kerne, Zachary O. Toups, Blake Dworaczyk, ...
We present new techniques for supervised wrapper generation and automated web information extraction, and a system called Lixto implementing these techniques. Our system can gener...
The current methods of publishing chemical information in bioscience articles are analysed. Using 3 papers as use-cases, it is shown that conventional methods using human procedur...
Peter Murray-Rust, John B. O. Mitchell, Henry S. R...
An increasing number of business users and software applications need to process information that is accessible via multiple diverse information systems, such as database systems,...