XML is becoming a prevalent format for data exchange. Many XML documents have complex schemas that are not always known, and can vary widely between information sources and applica...
Eugene Agichtein, C. T. Howard Ho, Vanja Josifovsk...
We describe research carried out as part of a text summarisation project for the legal domain for which we use a new XML corpus of judgments of the UK House of Lords. These judgmen...
The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...
Language comprehension in humans is significantly constrained by memory, yet rapid, highly incremental, and capable of utilizing a wide range of contextual information to resolve ...
Roger P. Levy, Florencia Reali, Thomas L. Griffith...
The audio scene from broadcast soccer can be used for identifying highlights from the game. Audio cues derived from these sources provide valuable information about game events, a...