We study the design issues of data-centric XML documents where (1) there are no mixed contents, i.e., each element may have some subelements and attributes, or it may have a singl...
We predict stock markets using information contained in articles published on the Web. Mostly textual articles appearing in the leading and the most influential financial newspapers...
Web search engines like Google have made us all smarter by providing ready access to the world's knowledge whenever we need to look up a fact, learn about a topic or evaluate...
Individual privacy will be at risk if a published data set is not properly deidentified. k-Anonymity is a major technique to deidentify a data set. Among a number of k-an...
Jiuyong Li, Raymond Chi-Wing Wong, Ada Wai-Chee Fu...
This paper introduces Clustera, an integrated computation and data management system. In contrast to traditional cluster management systems that target specific types of workloads,...
David J. DeWitt, Erik Paulson, Eric Robinson, Jeff...