Sciweavers

422 search results - page 79 / 85
» Beyond the Web: Retrieval in Social Information Spaces
WWW
2007
ACM
14 years 8 months ago
GigaHash: scalable minimal perfect hashing for billions of urls
A minimal perfect hash function maps a static set of n keys onto the range of integers {0, 1, 2, ..., n - 1}. We present a scalable high-performance algorithm based on random graphs for ...
Kumar Chellapilla, Anton Mityagin, Denis Xavier Ch...
ICDE
2007
IEEE
14 years 8 months ago
Source-aware Entity Matching: A Compositional Approach
Entity matching (a.k.a. record linkage) plays a crucial role in integrating multiple data sources, and numerous matching solutions have been developed. However, the solutions have...
Warren Shen, Pedro DeRose, Long Vu, AnHai Doan, Ra...
WWW
2009
ACM
14 years 8 months ago
Collaborative filtering for orkut communities: discovery of user latent behavior
Users of social networking services can connect with each other by forming communities for online interaction. Yet as the number of communities hosted by such websites grows over ...
WenYen Chen, Jon-Chyuan Chu, Junyi Luan, Hongjie B...
CIKM
2011
Springer
12 years 7 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
CICLING
2005
Springer
14 years 1 month ago
The UNL Initiative: An Overview
We are presenting a description of the UNL initiative based on the Universal Networking Language (UNL). This language was conceived to be the support of the multilingual communicat...
Igor Boguslavsky, Jesús Cardeñosa, C...