In this paper we present a generic approach for summarising multilingual news clusters such as the ones produced by the Europe Media Monitor (EMM) system. It is generic because ...
Mijail Alexandrov Kabadjov, Josef Steinberger, Bru...
While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimension...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
Feature selection is often applied to high-dimensional data prior to classification learning. Using the same training dataset in both selection and learning can result in so-called ...
Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document met...