The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
DelosDLMS is a prototype of a next-generation Digital Library (DL) management system. It is the result of integrating various specialized DL services provided by partners of the D...
Social tagging is becoming increasingly popular in many Web 2.0 applications where users can annotate resources (e.g. Web pages) with arbitrary keywords (i.e. tags). A tag recomme...
Ziyu Guan, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can ...
Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document ...
Music information retrieval (MIR) holds great promise as a technology for managing large music archives. One of the key components of MIR that has been actively researched is...
Jialie Shen, Wang Meng, Shuichang Yan, HweeHwa Pan...