— Category Ranking is a variant of the multi-label classification problem, in which, rather than performing a (hard) assignment to an object of categories from a predefined set...
Many content-oriented applications require a scalable text index. Building such an index is challenging. In addition to the logic of inserting and searching documents, developers ...
In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedu...
Stemming algorithms find canonical forms for inflected words, e. g. for declined nouns or conjugated verbs. Since such a unification of words with respect to gender, number, time, ...
In this paper, we introduce the idea of Intent Analysis, which is to create a profile of the goals and intentions present in textual content. Intent Analysis, similar to Sentiment...