Abstract. Efficient nearest neighbor (NN) search techniques for high-dimensional data are crucial to content-based image retrieval (CBIR). Traditional data structures (e.g., kd-tree...
Pengcheng Wu, Steven C. H. Hoi, Duc Dung Nguyen, Y...
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper, the distribution and types of new words are discussed emp...
Advances in data collection technologies allow the accumulation of large, high-dimensional datasets and provide opportunities for learning high-quality classification and regression...
Despite popular belief, boosting algorithms and related coordinate descent methods are prone to overfitting. We derive modifications to AdaBoost and related gradient-based coordin...
Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that a...
Malik Yousef, Segun Jung, Louise C. Showe, Michael...