Abstract. Requirements engineering, the first phase of any software development project, is the Achilles’ heel of the whole development process, as requirements documents are of...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
Genre, like layout, is an important factor in effective communication, and automated tools which assist in genre compliance are thus of considerable value. Genres are reusable met...
Marc Nanard, Jocelyne Nanard, Peter R. King, Ludov...
Query-independent features (also called document priors), such as the number of incoming links to a document, its Page-Rank, or the type of its associated URL, have been successfu...
In this paper, we propose a document clustering method that strives to achieve: (1) a high accuracy of document clustering, and (2) the capability of estimating the number of clus...