Easy access to the Web has led to increased potential for students cheating on assignments by plagiarising others’ work. By the same token, Web-based tools offer the potential f...
Raphael A. Finkel, Arkady B. Zaslavsky, Kriszti&aa...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
In using web search engines, there are cases where the name of the target object is unavailable, and the user can only give the visual descriptions of the object. The existing keyw...
Although existing work has explored both information extraction and community content creation, most research has focused on them in isolation. In contrast, we see the greatest le...
A large fraction of the useful web comprises of specification documents that largely consist of hattribute name, numeric valuei pairs embedded in text. Examples include product in...