Abstract--Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...
Document clustering has been used for better document retrieval, document browsing, and text mining in digital library. In this paper, we perform a comprehensive comparison study ...
We present a system towards the integration of data mining into relational databases. To this end, a relational database model is proposed, based on the so called virtual mining vi...
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Discriminative sequential learning models like Conditional Random Fields (CRFs) have achieved significant success in several areas such as natural language processing, information...
Xuan Hieu Phan, Minh Le Nguyen, Tu Bao Ho, Susumu ...