Sciweavers

KDD
2008
ACM
Data and Structural k-Anonymity in Social Networks
The advent of social network sites in the last years seems to be a trend that will likely continue. What naive technology users may not realize is that the information they provide...
Alina Campan, Traian Marius Truta
Geocode Matching and Privacy Preservation
Geocoding is the process of matching addresses to geographic locations, such as latitudes and longitudes, or local census areas. In many applications, addresses are the key to geo-...
Peter Christen
Stable feature selection via dense feature groups
Many feature selection algorithms have been proposed in the past focusing on improving classification accuracy. In this work, we point out the importance of stable feature selecti...
Lei Yu, Chris H. Q. Ding, Steven Loscalzo
FastANOVA: an efficient algorithm for genome-wide association study
Studying the association between quantitative phenotype (such as height or weight) and single nucleotide polymorphisms (SNPs) is an important problem in biology. To understand und...
Xiang Zhang, Fei Zou, Wei Wang
Anomaly pattern detection in categorical datasets
We propose a new method for detecting patterns of anomalies in categorical datasets. We assume that anomalies are generated by some underlying process which affects only a particu...
Kaustav Das, Jeff G. Schneider, Daniel B. Neill
Unsupervised deduplication using cross-field dependencies
Recent work in deduplication has shown that collective deduplication of different attribute types can improve performance. But although these techniques cluster the attributes col...
Robert Hall, Charles A. Sutton, Andrew McCallum
Combinational collaborative filtering for personalized community recommendation
Rapid growth in the amount of data available on social networking sites has made information retrieval increasingly challenging for users. In this paper, we propose a collaborativ...
Wen-Yen Chen, Dong Zhang, Edward Y. Chang
Data mining using high performance data clouds: experimental studies using Sector and Sphere
We describe the design and implementation of a high performance cloud that we have used to archive, analyze and mine large distributed data sets. By a cloud, we mean an infrastruc...
Robert L. Grossman, Yunhong Gu
On updates that constrain the features' connections during learning
In many multiclass learning scenarios, the number of classes is relatively large (thousands,...), or the space and time efficiency of the learning system can be crucial. We invest...
Omid Madani, Jian Huang
Feedback effects between similarity and social influence in online communities
David J. Crandall, Dan Cosley, Daniel P. Huttenloc...