Recent progress in information extraction technology has enabled a vast array of applications that rely on structured data embedded in natural-language text. In particular, extracting concepts from the Web, together with their desired attributes, is important for providing applications with rich, structured access to information. In this paper, we focus on an important family of concepts, namely entities (e.g., people or organizations) and their attributes, and study how to extract them efficiently and effectively from Web-accessible text documents. Unfortunately, information extraction over the Web is challenging for both quality and efficiency reasons. Regarding quality, many sources on the Web contain misleading or invalid information; furthermore, extraction systems often return incorrect data. Regarding efficiency, information extraction is a time-consuming process, often involving expensive text-processing steps. We present a top-k extraction processing approach that addr...