Disambiguating person names in a set of documents (such as a set of web pages returned in response to a person name) is a key task for the presentation of results and the automatic...
Bipartite graphs are often used to illustrate relationships between two sets of data, such as web pages and visitors. At the same time, information is often organized hi...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Gambal is an information retrieval system for indexing and accessing web pages that includes graphical interfaces to ease web page search and access. In particular, the interfa...
Document clustering techniques have been applied in several areas, with the web among the most recent and influential. Both general-purpose and text-oriented techniques exist and...