Today’s one-pass analytics applications tend to be data-intensive in nature and require the ability to process high volumes of data efficiently. MapReduce is a popular programm...
Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGreg...
We present the design, implementation and evaluation of an algorithm that checks audit logs for compliance with privacy and security policies. The algorithm, which we name reduce,...
Traditional feature selection methods assume that the data are independent and identically distributed (i.i.d.). In real world, tremendous amounts of data are distributed in a net...
—Sensor networks are often redundant by design; this is often done in order to achieve reliability in information processing. In many cases, the redundancy relationships between ...
—Popular Internet services are hosted by multiple geographically distributed datacenters. The location of the datacenters has a direct impact on the services’ response times, c...