We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
The Online Database of Interlinear Text (ODIN) is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...
Unstructured text represents a large fraction of the world's data. It often contains snippets of structured information (e.g., people's names and zip ...
Daisy Zhe Wang, Eirinaios Michelakis, Joseph M. He...
It is well known that utterances convey a great deal of information about the speaker in addition to their semantic content. One such type of information consists of cues to the s...