The Web of Linked Data grows rapidly and already contains data originating from hundreds of data sources. The quality of data from those sources is very diverse, as values may be ...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
It is well known that different types of social ties have essentially different influence between people. However, users in online social networks rarely categorize their contact...
Selection of genes that are differentially expressed and critical to a particular biological process has been a major challenge in post-array analysis. Recent development in bioin...
Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Y...
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarel...
Andrew Fast, Lisa Friedland, Marc Maier, Brian Tay...