Discovering the correct dataset efficiently is critical for computations and effective simulations in scientific experiments. In contrast to searching web documents over the Intern...
Sangmi Lee Pallickara, Shrideep Pallickara, Milija...
Abstract. The structural heterogeneity and complexity of XML repositories makes query formulation challenging for users who have little knowledge of XML. To assist its users, an XM...
Versioned textual collections are collections that retain multiple versions of a document as it evolves over time. Important large-scale examples are Wikipedia and the web collect...
Geography Markup Language (GML) is an XML-based language for the markup, storage, and exchange of geospatial data. It provides a rich geospatial vocabulary and allows flexible doc...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...