This paper introduces a new technique for document clustering based on frequent senses. The proposed system, GDClust (Graph-Based Document Clustering), works with frequent senses ra...
In an era in which searching the WWW for information has become a tedious task, it is obvious that search engines in particular, along with other data mining mechanisms, need to be enhanced with charact...
Abstract. API error-handling specifications are often not documented, necessitating automated specification mining. Automated mining of error-handling specifications is challenging...
In this paper, we describe a system by which the multilingual characteristics of Wikipedia can be utilized to annotate a large corpus of text with Named Entity Recognition (NER) t...
Abstract. One major goal of text mining is to provide automatic methods to help humans grasp the key ideas in ever-increasing text corpora. To this effect, we propose a statistica...