We describe the design and implementation of Privacy Oracle, a system that reports on application leaks of user information via the network traffic that they send. Privacy Oracle ...
Jaeyeon Jung, Anmol Sheth, Ben Greenstein, David W...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...
Enterprise and web data processing and content aggregation systems often require extensive use of human-reviewed data (e.g. for training and monitoring machine learning-based appl...
Qi Su, Dmitry Pavlov, Jyh-Herng Chow, Wendell C. B...
XML has become one of the core technologies for contemporary business applications, especially web-based applications. To facilitate processing of diverse XML data, we propose an ...
Quanzhong Li, Michelle Y. Kim, Edward So, Steve Wo...