Deriving a thematically meaningful partition of an unlabeled document corpus is a challenging task. In this context, the use of document representations based on latent thematic ge...
The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand but may also vary from one ...
Incremental hierarchical text document clustering algorithms are important in organizing documents generated from streaming on-line sources, such as newswire and blogs. However, ...
We present Zeus, an environment designed to aid in the creation of a repository of digital theses. Zeus is an asynchronous cooperative toolset which allows the revision and annota...
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...