Huge amounts of data are stored in autonomous, geographically distributed sources. The discovery of previously unknown, implicit and valuable knowledge is a key aspect of the expl...
We introduce the data model BM, which specifies kernels of motifs by means of Boolean matrices. Unlike position frequency matrices, these only specify which bases can appea...
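To make the Boolean-matrix idea concrete, the following is a minimal sketch, not the BM model itself. It assumes a 4 x k matrix whose rows follow the order A, C, G, T, and it assumes that a window matches when every base is allowed at its position. The names `BASE_ROW`, `matches`, `find_occurrences`, and `EXAMPLE_BM` are hypothetical and do not come from the abstract.

```python
# Assumed row order for the Boolean matrix (not stated in the abstract).
BASE_ROW = {"A": 0, "C": 1, "G": 2, "T": 3}


def matches(bm, window):
    """Return True if every base in `window` is allowed at its position."""
    if len(window) != len(bm[0]):
        return False
    for position, base in enumerate(window):
        row = BASE_ROW.get(base)
        # Unknown symbols (e.g. "N") are treated as non-matching here.
        if row is None or not bm[row][position]:
            return False
    return True


def find_occurrences(bm, sequence):
    """Return the start indices of all windows of `sequence` that match `bm`."""
    width = len(bm[0])
    return [
        start
        for start in range(len(sequence) - width + 1)
        if matches(bm, sequence[start:start + width])
    ]


# Hypothetical kernel of width 3:
# position 0 allows A or G, position 1 allows C, position 2 allows A or T.
EXAMPLE_BM = [
    [True, False, True],    # A
    [False, True, False],   # C
    [True, False, False],   # G
    [False, False, True],   # T
]
```

Under these assumptions, `find_occurrences(EXAMPLE_BM, "ACAGCTTT")` scans every window of width 3 and reports the start positions whose bases are all permitted. Unlike a frequency matrix, the Boolean matrix carries no weights, so a window either matches or it does not.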
We propose to solve a text categorization task using a new metric between documents, based on a priori semantic knowledge about words. This metric can be incorporated into the def...
Quantiles, also known as value-at-risk in financial applications, are important measures of random performance. Quantile sensitivities provide information on how changes in the i...
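As a small illustration of the quantity under discussion, the sketch below computes an empirical quantile from a sample using the order-statistic definition (the smallest observation whose empirical distribution function value is at least alpha). This is a standard textbook estimator chosen for illustration, not the estimator of the paper, and it does not compute sensitivities. The function name `empirical_quantile` is hypothetical.

```python
import math


def empirical_quantile(samples, alpha):
    """Return the alpha-quantile of `samples` as an order statistic.

    Uses the ceil(alpha * n)-th smallest observation, i.e. the smallest
    value whose empirical CDF is at least alpha.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(samples)
    rank = math.ceil(alpha * len(ordered))  # 1-based rank of the order statistic
    return ordered[rank - 1]
```

In a value-at-risk reading, `samples` would be simulated losses and `alpha` the confidence level, so the returned value is the loss threshold that is not exceeded in roughly a fraction `alpha` of the scenarios.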
In this paper we propose a domain-independent text segmentation method, which consists of three components. Latent Dirichlet allocation (LDA) is employed to compute words semantic ...