One of the biggest obstacles faced by user-command-based anomaly detection techniques is the paucity of data. Gathering command data is a slow process, often spanning months or yea...
Two-dimensional point data can be considered one of the most basic, yet one of the most ubiquitous, data types, arising in a wide variety of applications. The basic scatter plot app...
Tatiana von Landesberger, Sebastian Bremm, Peyman ...
Abstract. Medical reports are predominantly written in natural language; as such they are not computer-accessible. A common way to make medical narrative accessible to automated sy...
Janneke van der Zwaan, Erik F. Tjong Kim Sang, Maa...
Abstract. We consider three paradigms of computation where the benefits of a parallel solution are greater than usual. Paradigm 1 works on a time-varying input data set, whose size ...
In this paper, we describe a query system that provides visual relevance feedback in querying large databases. Our goal is to support the process of data mining by representing as...
The operation of a computer can be conceptualised as a large, discrete and constantly changing set of state information. However, even for the simplest uni-processor the data set o...
Paul S. Coe, Laurence M. Williams, Roland N. Ibbet...
This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structur...
We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We define a framework for measuring the efficiency of an ind...
Joseph M. Hellerstein, Elias Koutsoupias, Christos...
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation becau...
In recent years, scientific visualization has been driven by the need to visualize high-dimensional data sets within high-dimensional spaces. However, most visualization methods ar...