Patent classification is a large-scale hierarchical text classification (LSHTC) task. Though comprehensive comparisons, either learning algorithms or feature selection strategies...
PKIP, Patterned Keywords in Phrase, is our feature selection approach to text categorization (TC) for item banks. An item bank is a collection of textual data in which each item c...
Atorn Nuntiyagul, Nick Cercone, Kanlaya Naruedomku...
Abstract. The bag-of-words (BOW) model is inspired by the text classification problem, where a document is represented by an unordered set of the words it contains. Analogously, in the objec...
Mehdi Mirza-Mohammadi, Sergio Escalera, Petia Rade...
This paper puts forward a hierarchical approach for categorizing emails with the ME model based on their contents and properties. This approach categorizes emails in a two-phase way...
Abstract. This paper proposes the use of Latent Semantic Indexing (LSI) techniques, decomposed with the semi-discrete matrix decomposition (SDD) method, for text categorization. The SD...