In plenty of scenarios, data can be represented as vectors mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining app...
Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe ...
Byron C. Wallace, Kevin Small, Carla E. Brodley, T...
Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to...
We present rminer, our open source library for the R tool that facilitates the use of data mining (DM) algorithms, such as neural Networks (NNs) and support vector machines (SVMs),...
Malware detection is an important problem today. New malware appears every day and in order to be able to detect it, it is important to recognize families of existing malware. Dat...
The popularity of computer networks broadens the scope for network attackers and increases the damage these attacks can cause. In this context, Intrusion Detection Systems (IDS) a...
Supervised text categorization is a machine learning task where a predefined category label is automatically assigned to a previously unlabelled document based upon characteristic...