In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
Poor-quality data is prevalent in databases for a variety of reasons, including transcription errors and a lack of standards for recording database fields. To be able to query ...
Byung-Won On, Nick Koudas, Dongwon Lee, Divesh Sri...
This paper addresses the problem of efficient maintenance of a materialized skyline view in response to skyline removals. While there has been significant progress on skyline quer...
Queries containing outer joins are common in data warehousing applications. Materialized outer-join views could greatly speed up many such queries but most database systems do not...
Database performance can be greatly affected by environmental and internal dynamics such as workloads and system configurations. Existing strategies to maintain performance under ...
Yi-Cheng Tu, Song Liu, Sunil Prabhakar, Bin Yao, W...
Modeling objects by probability density functions (pdfs) is a powerful new method for representing complex objects in databases. By representing an object as a pdf, e.g. a Gaussian, it...
Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological annotation. In such domains, the correct value o...
Sarvjeet Singh, Chris Mayfield, Sunil Prabhakar, R...
A k-nearest neighbor (k-NN) query retrieves k objects from a database that are considered to be the closest to a given query point. Numerous techniques have been proposed in the p...
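As a point of reference for the k-NN definition above, the query can be sketched as a plain linear scan. This is only a baseline illustration, not the technique proposed in the paper; the function name `knn`, the sample points, and the choice of Euclidean distance are all assumptions made for the example.

```python
import heapq
import math


def knn(points, query, k):
    """Return the k points closest to `query`, nearest first.

    Brute-force baseline: every point is compared against the query
    using Euclidean distance (an assumed metric for this sketch).
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Hypothetical 2-D data set and query point.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
nearest = knn(points, (1.2, 0.9), 2)
```

A scan like this touches every object in the database on each query, which is the cost that index-based k-NN techniques aim to avoid.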