Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Already in the early 14th century, the philosopher William of Ockham created the philosophical principle known as Ockham’s Razor, which can be translated from Latin as ”entiti...
To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world...
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, ...
The next wave in search technology will be driven by the identification, extraction, and exploitation of real-world entities represented in unstructured textual sources. Search sy...