Sciweavers

174 search results - page 24 / 35
» Unsupervised Feature Selection for Text Data
DATAMINE
2006
13 years 7 months ago
Characteristic-Based Clustering for Time Series Data
With the growing importance of time series clustering research, particularly for similarity searches amongst long time series such as those arising in medicine or finance, it is cr...
Xiaozhe Wang, Kate A. Smith, Rob J. Hyndman
ICML
2008
IEEE
14 years 8 months ago
Expectation-maximization for sparse and non-negative PCA
We study the problem of finding the dominant eigenvector of the sample covariance matrix, under additional constraints on the vector: a cardinality constraint limits the number of...
Christian D. Sigg, Joachim M. Buhmann
CLEF
2010
Springer
13 years 8 months ago
ZOT! to Wikipedia Vandalism - Lab Report for PAN at CLEF 2010
This vandalism detector uses features primarily derived from a word-preserving differencing of the text for each Wikipedia article from before and after the edit, along wit...
James White, Rebecca Maessen
KDD
2007
ACM
14 years 8 months ago
Nonlinear adaptive distance metric learning for clustering
A good distance metric is crucial for many data mining tasks. To learn a metric in the unsupervised setting, most metric learning algorithms project observed data to a low-dimensio...
Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu
EMNLP
2010
13 years 5 months ago
A New Approach to Lexical Disambiguation of Arabic Text
We describe a model for the lexical analysis of Arabic text, using the lists of alternatives supplied by a broad-coverage morphological analyzer, SAMA, which include stable lemma ...
Rushin Shah, Paramveer S. Dhillon, Mark Liberman, ...