In this paper, we present an unsupervised hybrid model which combines statistical, lexical, linguistic, contextual, and temporal features in a generic EMbased framework to harvest...
Paraphrase recognition is a critical step for natural language interpretation. Accordingly, many NLP applications would benefit from high coverage knowledge bases of paraphrases. ...
Information distillation techniques are used to analyze and interpret large volumes of speech and text archives in multiple languages and produce structured information of interes...
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. There are current...