Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions and even ambiguous. Multiple sources of regulation...
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
We present a framework for automatically summarizing social group activity over time. The problem is important in understanding large scale online social networks, which have dive...
In contextual advertising, estimating the number of impressions of an ad is critical in planning and budgeting advertising campaigns. However, producing this forecast, even within...
Xuerui Wang, Andrei Z. Broder, Marcus Fontoura, Va...
Storage techniques and queries over XML databases are being widely studied. Most works store XML documents in traditional DBMSs in order to take advantage of a well established te...