This paper presents the results of classifying Arabic text documents using the N-gram frequency statistics technique employing a dissimilarity measure called the "Manhattan di...
Traditionally in industrial system development, the total project is decomposed into phases. The result from one phase, normally a document or a system component, is passed to the...
Ulf Cederling, Roland Ekinge, Bengt Lennartsson, L...
This paper presents a means of automatically deriving a hierarchical organization of concepts from a set of documents without use of training data or standard clustering technique...
E-commerce and knowledge management applications generate and consume tremendous amounts of online information that is typically available as textual documents. To facilitate subs...
Software and Information Systems (IS) documents are a common product of large IS development efforts. These documents are produced and consumed through a variety of documentation p...