Product data exchange is the precondition of business interoperation between Web-based firms. However, millions of small and medium sized enterprises (SMEs) encode their Web produ...
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...
This paper reports on the underlying IR problems encountered when indexing and searching with the Bulgarian language. For this language we propose a general light stemmer and demon...
We investigate the problem of ordering medical events in unstructured clinical narratives by learning to rank them based on their time of occurrence. We represent each medical eve...
Preethi Raghavan, Albert M. Lai, Eric Fosler-Lussi...
The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...