Classification is one of the basic tasks of data mining in modern database applications including molecular biology, astronomy, mechanical engineering, medical imaging or meteorolo...
In this study we propose sketching algorithms for computing similarities between hierarchical data. Specifically, we look at data objects that are represented using leaf-labeled t...
Search engines of mainstream literature digital libraries such as ACM Digital Library, Google Scholar, and PubMed employ file-based systems and provide users with a basic Boolean...
John Chmura, Nattakarn Ratprasartporn, Gultekin &O...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Structured query language (SQL) is a classical way to access relational databases. Although SQL is a powerful means of querying relational databases, it is rather hard for inexperienced user...
Guoliang Li, Ju Fan, Hao Wu, Jiannan Wang, Jianhua...