Internet is a huge source of information. Search engines have indexed much of this information and are able to extract the relevant webpages that are related to a given query. Howe...
We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Information extraction approaches are heavily used to gather product information on the Web, especially focusing on technical product specifications. If requesting different sour...
Linguists and geographers are more and more interested in route direction documents because they contain interesting motion descriptions and language patterns. A large number of s...
Xiao Zhang, Prasenjit Mitra, Sen Xu, Anuj R. Jaisw...