Extracting dense sub-components from graphs efficiently is an important objective in a wide range of application domains ranging from social network analysis to biological network...
Nan Wang, Srinivasan Parthasarathy, Kian-Lee Tan, ...
With ever-increasing amounts of graph data from disparate sources, there has been a strong need for exploiting significant graph patterns with user-specified objective functions. ...
Online bibliographic databases, such as DBLP in computer science and PubMed in medical sciences, contain abundant information about research publications in different fields. Each...
Yizhou Sun, Tianyi Wu, Zhijun Yin, Hong Cheng, Jia...
In the implementation of hosted business services, multiple tenants are often consolidated into the same database to reduce total cost of ownership. Common practice is to map mult...
Stefan Aulbach, Torsten Grust, Dean Jacobs, Alfons...
Current approaches to develop information extraction (IE) programs have largely focused on producing precise IE results. As such, they suffer from three major limitations. First, ...
Warren Shen, Pedro DeRose, Robert McCann, AnHai Do...
There has been a significant amount of excitement and recent work on column-oriented database systems ("column-stores"). These database systems have been shown to perfor...
We present SchemaScope, a system to derive Document Type Definitions and XML Schemas from corpora of sample XML documents. Tools are provided to visualize, clean, and refine exist...
How can we efficiently find a clustering, i.e. a concise description of the cluster structure, of a given data set which contains an unknown number of clusters of different shape ...
Today's query processing engines do not take advantage of the multiple occurrences of a relation in a query to improve performance. Instead, each instance is treated as a dis...
Yu Cao, Gopal C. Das, Chee Yong Chan, Kian-Lee Tan