In this paper we explore database segmentation in the context of a column-store DBMS targeted at a scientific database. We present a novel hardware- and scheme-oblivious segmentati...
Feature weighting or selection is a crucial process to identify an important subset of features from a data set. Removing irrelevant or redundant features can improve the generali...
Dimensionality reduction plays an important role in many data mining applications involving high-dimensional data. Many existing dimensionality reduction techniques can be formula...
Cloud storage is an emerging infrastructure that offers Platforms as a Service (PaaS). On such platforms, storage and compute power are adjusted dynamically, and therefore it is i...
Social media such as blogs, Facebook, Flickr, etc., presents data in a network format rather than classical IID distribution. To address the interdependency among data instances, ...