Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
—We propose a novel scenario called “writer retrieval” consisting in the retrieval from a set of documents all those produced by the same writer. The retrieval is based on a ...
Among various document clustering algorithms that have been proposed so far, the most useful are those that automatically reveal the number of clusters and assign each target docum...
Eugene Levner, David Pinto, Paolo Rosso, David Alc...
HyPursuit is a new hierarchical network search engine that clusters hypertext documents to structure a given information space for browsing and search activities. Our content-link...
In this paper, we will compare and evaluate the effectiveness of different statistical methods in the task of cross-document coreference resolution. We created entity models for d...