Wikipedia has become an important source of information which is growing very rapidly. However, the existing infrastructure for querying this information is limited and often igno...
Huong Nguyen, Thanh Nguyen, Hoa Nguyen, Juliana Fr...
Wikipedia infoboxes is an example of a seemingly structured, yet extraordinarily heterogeneous dataset, where any given record has only a tiny fraction of all possible fields. Su...
The MapReduce distributed programming framework is very popular, but currently lacks the optimization techniques that have been standard with relational database systems for many ...
An important requirement for emerging applications which aim to locate and integrate content distributed over the Web is to identify pages that are relevant for a given domain or ...
Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data that is embedded in natural-language text. In particular...