Using a ground truth extracted from Wikipedia and a ground truth created through manual assessment, we show that the apparent performance advantage seen in machine learning a...
We explore statistical properties of links within Wikipedia. We demonstrate that a simple algorithm can predict many of the links that would normally be added to a new article, wit...
Kelly Y. Itakura, Charles L. A. Clarke, Shlomo Gev...
A lot of the world’s knowledge is stored in books, which, as a result of recent mass-digitisation efforts, are increasingly available online. Search engines, such as Google Book...
This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize information extraction techniques to identify entity candidat...
Julia Stoyanovich, Srikanta J. Bedathur, Klaus Ber...
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...