Large text corpora with news, customer mail and reports, or Web 2.0 contributions offer a great potential for enhancing business-intelligence applications. We propose a framework ...
Srikanta J. Bedathur, Klaus Berberich, Jens Dittri...
Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of ru...
Current pattern-detection proposals for streaming data recognize the need to move beyond a simple regular-expression model over strictly ordered input. We continue in this directi...
Badrish Chandramouli, Jonathan Goldstein, David Ma...
Ontology-based data access is a powerful form of extending database technology, where a classical extensional database (EDB) is enhanced by an ontology that generates new intensio...
Data ambiguity is inherent in applications such as data integration, location-based services, and sensor monitoring. In many situations, it is possible to “clean”, or remove, ...
Reynold Cheng, Eric Lo, Xuan Yang, Ming-Hay Luk, X...
Recently, there has been increasing interest in extending relational query processing to include data obtained from unstructured sources. A common approach is to use stand-alone I...
Daisy Zhe Wang, Michael J. Franklin, Minos N. Garo...
Entity resolution (ER) identifies database records that refer to the same real world entity. In practice, ER is not a one-time process, but is constantly improved as the data, sc...
Shortest path computation is one of the most common queries in location-based services that involve transportation networks. Motivated by scalability challenges faced in the mobil...
Prediction is emerging as an essential ingredient for real-time monitoring, planning and decision support applications such as intrusion detection, e-commerce pricing and automate...