Sciweavers

156 search results - page 24 / 32
» The UCI KDD Archive of Large Data Sets for Data Mining Resea...
PAKDD
2009
ACM
14 years 6 days ago
Accurate Synthetic Generation of Realistic Personal Information
A large proportion of the massive amounts of data that are being collected by many organisations today is about people, and often contains identifying information like names, addre...
Peter Christen, Agus Pudjijono
KDD
2004
ACM
14 years 8 months ago
Exploiting a support-based upper bound of Pearson's correlation coefficient for efficiently identifying strongly correlated pair
Given a user-specified minimum correlation threshold and a market basket database with N items and T transactions, an all-strong-pairs correlation query finds all item pairs with...
Hui Xiong, Shashi Shekhar, Pang-Ning Tan, Vipin Ku...
JMLR
2010
13 years 2 months ago
Feature Selection, Association Rules Network and Theory Building
As the size and dimensionality of data sets increase, the task of feature selection has become increasingly important. In this paper we demonstrate how association rules can be us...
Sanjay Chawla
CIKM
2004
Springer
14 years 1 month ago
Optimizing web search using web click-through data
The performance of web search engines may often deteriorate due to the diversity and noisy information contained within web pages. User click-through data can be used to introduce...
Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Yong Yu, W...
SAC
2006
ACM
14 years 1 month ago
The impact of sample reduction on PCA-based feature extraction for supervised learning
“The curse of dimensionality” is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimension...
Mykola Pechenizkiy, Seppo Puuronen, Alexey Tsymbal