In this paper we introduce a new M-tree building method, utilizing the classic idea of forced reinsertions. In case a leaf is about to split, some distant objects are removed from...
This paper describes how use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic DataGathering, Analysis, and Retrieval system)....
Corporate fraud these days represents a huge cost to our economy. Academic literature already concentrated on how data mining techniques can be of value in the fight against frau...
We describe DEIMOS, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that makes an en...
Clustering is an active research topic in data mining and different methods have been proposed in the literature. Most of these methods are based on the use of a distance measure ...
We present a demo of ESTER, a search engine that combines the ease of use, speed and scalability of full-text search with the powerful semantic capabilities of ontologies. ESTER s...
Several marketing problems involve prediction of customer purchase behavior and forecasting future preferences. We consider predictive modeling of large scale, bi-modal or multimo...
An Association Rule (AR) is a common knowledge model in data mining that describes an implicative cooccurring relationship between two disjoint sets of binary-valued transaction d...
Yanbo J. Wang, Xinwei Zheng, Frans Coenen, Cindy Y...
Word meaning disambiguation has always been an important problem in many computer science tasks, such as information retrieval and extraction. One of the problems, faced in automa...