Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Postcorrection of OCR results for text documents is usually based on electronic dictionaries. When scanning texts from a specific thematic area, conventional dictionaries often m...
Christian M. Strohmaier, Christoph Ringlstetter, K...
Wikipedia is an example of the collaborative, semi-structured data sets emerging on the Web. These data sets have large, nonuniform schemas that require costly data integra...
Bryan Chan, Leslie Wu, Justin Talbot, Mike Cammara...
The amount of information available on the Web has increased rapidly, reaching levels that few would ever have imagined possible. We live in what could be called the "informa...
In this paper, we analyze whether dictionaries from the World Wide Web that contain phonetic notations may support the rapid creation of pronunciation dictionaries within the sp...