Large scale distributed real time and embedded (DRE) applications are complex entities that are often composed of different subsystems and have stringent Quality of Service (QoS) r...
Praveen Kaushik Sharma, Joseph P. Loyall, George T...
We consider the problem of finding highly correlated pairs in a large data set. That is, given a threshold not too small, we wish to report all the pairs of items (or binary attri...
Comprehensive data analysis has become indispensable in a variety of environments. Standard OLAP (On-Line Analytical Processing) systems, designed for satisfying the reporting need...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Abstract. Data mining is an iterative process. Users issue series of similar data mining queries, in each consecutive run slightly modifying either the definition of the mined dat...
Mikolaj Morzy, Tadeusz Morzy, Marek Wojciechowski,...