Data mining algorithms use various Trie and bitmap-based representations to optimize the support (i.e., frequency) counting performance. In this paper, we compare the memory requi...
Distance function computation is a key subtask in many data mining algorithms and applications. The most effective form of the distance function can only be expressed in the conte...
Skewed distributions appear very often in practice. Unfortunately, the traditional Zipf distribution often fails to model them well. In this paper, we propose a new probability di...
Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensiona...
We are developing a technique to predict travel time of a vehicle for an objective road section, based on real time traffic data collected through a probe-car system. In the area ...