Data-intensive applications are increasingly designed to execute on large computing clusters. Grouped aggregation is a core primitive of many distributed programming models, and i...
Large graph databases are commonly collected and analyzed in numerous domains. For reasons of either space efficiency or privacy protection (e.g., in the case of socia...
Given a set of multi-dimensional points, the skyline contains the best points according to any preference function that is monotone on all axes. In practice, applications that req...
Near-duplicate video clip (NDVC) detection is an important problem with a wide range of applications such as TV broadcast monitoring, video copyright enforcement, content-based vi...
Heng Tao Shen, Xiaofang Zhou, Zi Huang, Jie Shao, ...
Machine-learning algorithms are employed in a wide variety of applications to extract useful information from data sets, and many are known to suffer from superlinear increases in ...
Karthik Nagarajan, Brian Holland, Alan D. George, ...