Column-oriented database system architectures invite a reevaluation of how and when data in databases is compressed. Storing data in a column-oriented fashion greatly increases th...
Numerous approaches, including textual, structural and featural, to detecting duplicate documents have been investigated. Considering document images are usually stored and transm...
Practical clustering algorithms require multiple data scans to achieve convergence. For large databases, these scans become prohibitively expensive. We present a scalable clusteri...
Mining informative patterns from very large, dynamically changing databases poses numerous interesting challenges. Data summarizations (e.g., data bubbles) have been proposed to c...
Monitoring movement of high-dimensional points is essential for environmental databases, geospatial applications, and biodiversity informatics as it reveals crucial information ab...
Michalis Potamias, Kostas Patroumpas, Timos K. Sel...