This paper describes an unsupervised algorithm for segmenting categorical time series. The algorithm first collects statistics about the frequency and boundary entropy of ngrams, t...
The world wide web has a wealth of information that is related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
Extensive experimental evidence is required to study the impact of text categorization approaches on real data and to assess the performance within operational scenarios. In this ...
Roberto Basili, Alessandro Moschitti, Maria Teresa...
Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received an...
There is an ever increasing number of electronic documents available today and the task of organizing and categorizing this ever growing corpus of electronic documents has become t...