The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means ca...
Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify te...
Extractive text summarization aims to create a condensed version of one or more source documents by selecting the most informative sentences. Research in text summarization has th...
Abstract. This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
Background: The development of text mining systems that annotate biological entities with their properties using scientific literature is an important recent research topic. These...