Instance selection and feature selection are two orthogonal methods for reducing the amount and complexity of data. Feature selection aims at the reduction of redundant features i...
In the present paper, we consider automatic text categorization as a series of information processing and propose a new classification technique called the Frequency Ratio ...
Today, two classes of indexing methods enjoying wide applicability are the Inverted Index and the Superimposed Coding based Signature File (SC-SF). The former is most efficient i...
Dimitrios Dervos, P. Linardis, Yannis Manolopoulos
Web page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
In this paper we present a novel strategy, DragPushing, for improving the performance of text classifiers. The strategy is generic and takes advantage of training errors to succes...
Songbo Tan, Xueqi Cheng, Moustafa Ghanem, Bin Wang...