In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
This paper describes SHOE, a set of Simple HTML Ontology Extensions which allow World-Wide Web authors to annotate their pages with semantic knowledge such as “I am a graduate s...
Sean Luke, Lee Spector, David Rager, James A. Hend...
Is it possible to use sense inventories to improve Web search results diversity for one word queries? To answer this question, we focus on two broad-coverage lexical resources of ...
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structu...
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Gana...
Indexing and retrieval of speech content in various forms such as broadcast news, customer care data and on-line media has gained a lot of interest for a wide range of application...
Dogan Can, Erica Cooper, Arnab Ghoshal, Martin Jan...