With the growing importance of time series clustering research, particularly for similarity searches amongst long time series such as those arising in medicine or finance, it is cr...
We study the problem of finding the dominant eigenvector of the sample covariance matrix, under additional constraints on the vector: a cardinality constraint limits the number of...
Abstract This vandalism detector uses features primarily derived from a wordpreserving differencing of the text for each Wikipedia article from before and after the edit, along wit...
A good distance metric is crucial for many data mining tasks. To learn a metric in the unsupervised setting, most metric learning algorithms project observed data to a lowdimensio...
We describe a model for the lexical analysis of Arabic text, using the lists of alternatives supplied by a broad-coverage morphological analyzer, SAMA, which include stable lemma ...
Rushin Shah, Paramveer S. Dhillon, Mark Liberman, ...