When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...
The latent semantic indexing (LSI) methodology for information retrieval applies the singular value decomposition to identify an eigensystem for a large matrix, in which cells re...
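As a rough illustration of the SVD step this abstract describes, the following is a minimal sketch, assuming the large matrix is a small term-by-document count matrix. The function name `lsi_embed`, the toy vocabulary, and the choice of rank `k=2` are hypothetical and are not taken from the paper.

```python
import numpy as np

def lsi_embed(term_doc, k):
    """Rank-k LSI sketch: return term basis, singular values, document coordinates."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular triplets (the truncated "eigensystem").
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # Row j of doc_coords is document j expressed in the k-dimensional latent space.
    doc_coords = (s_k[:, None] * Vt_k).T
    return U_k, s_k, doc_coords

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 1, 0],   # car
    [0, 1, 1, 0],   # automobile
    [1, 1, 1, 0],   # engine
    [0, 0, 0, 1],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)

U_k, s_k, docs = lsi_embed(A, k=2)

# Documents 0 and 1 use different words ("car" vs "automobile") but share "engine".
raw_sim = cosine(A[:, 0], A[:, 1])       # similarity on raw term counts
latent_sim = cosine(docs[0], docs[1])    # similarity after rank-2 reduction
unrelated_sim = cosine(docs[0], docs[3]) # vehicle document vs flower document
print(raw_sim, latent_sim, unrelated_sim)
```

In this toy case the two vehicle documents end up closer in the reduced space than on raw counts, while the flower document stays roughly orthogonal to them. Real collections would typically use weighted counts and a sparse truncated SVD rather than a dense full decomposition.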
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail we are given a set ...
This paper presents a cluster-validation-based document clustering algorithm, which is capable of identifying both important feature words and true model order (cluster number). I...
For the huge amounts of audio and video material that could usefully be included in digital libraries, the cost of producing human-generated annotations and meta-data is prohibiti...
Alexander G. Hauptmann, Michael J. Witbrock, Micha...