Wikipedia is becoming ever more popular. Linking between documents is typically provided in similar environments in order to achieve collaborative knowledge sharing. However, this ...
Darren Wei Che Huang, Yue Xu, Andrew Trotman, Shlo...
Abstract. Information Extraction, the process of eliciting data from natural language documents, usually relies on the ability to parse the document and then to detect the meaning ...
Current web search engines are retrospective in that they limit users to searches against already existing pages. Prospective search engines, on the other hand, allow users to upl...
We address the problem of integrating documents from different sources into a master catalog. This problem is pervasive in web marketplaces and portals. Current technology for aut...
Structured documents are commonly edited using a free-form editor. Even though every string is an acceptable input, it makes sense to maintain a structured representation of the e...