This paper describes a hidden Markov model (HMM)-based approach to search interface segmentation. Automatic processing of an interface is a must to access the invisible co...
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
It is well understood that the success of Semantic Web applications depends on the availability of machine-understandable metadata. We describe the Information Grid, a pr...
In this paper we present a solution for “weaving the claim web”, i.e. the creation of knowledge networks via so-called claims stated in scientific publications created with th...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...