With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
Name ambiguity is a special case of identity uncertainty where one person can be referenced by multiple name variations in different situations or even share the same name with ot...
Yang Song, Jian Huang 0002, Isaac G. Councill, Jia...
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
The increasing amount of communication between individuals in e-formats (e.g. email, Instant messaging and the Web) has motivated computational research in social network analysis...
Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, H...
The wealth of information available on the web makes it an attractive resource for seeking quick answers. While web-based question answering becomes an emerging topic in recent ye...