Imbalanced class problems appear in many real applications of classification learning. We propose a novel sampling method to improve bagging for data sets with skewed class distri...
The recent years have witnessed a surge of interests of semi-supervised clustering methods, which aim to cluster the data set under the guidance of some supervisory information. U...
This article presents a method for the calculating similarity of two trajectories. The method is especially designed for a situation where the points of the trajectories are distr...
This paper proposes a methodology to generate artificial data sets to evaluate the behavior of machine learning techniques. The methodology relies in the definition of a domain an...
Joaquin Rios-Boutin, Albert Orriols-Puig, Josep Ma...
In this work, we propose and compare several methods for the visualization and exploration of time-varying volumetric medical images based on the temporal characteristics of the d...
Clustering algorithms such as k-means, the self-organizing map (SOM), or Neural Gas (NG) constitute popular tools for automated information analysis. Since data sets are becoming l...
The problem of missing data is ubiquitous in domains such as biomedical signal processing, network traffic analysis, bibliometrics, social network analysis, chemometrics, computer...
Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda, Mo...
There is growing interest in applying Bayesian techniques to NLP problems. There are a number of different estimators for Bayesian models, and it is useful to know what kinds of t...
Support Vector Machines (SVM) have gained profound interest amidst the researchers. One of the important issues concerning SVM is with its application to large data sets. It is rec...
The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in...