The semi-structured information available in HTML and similar documents provides valuable information that can be used for information extraction applications. This information tog...
Large-scale digitization projects have been conducted at the Internet Archive digital library to preserve cultural artifacts and to provide permanent access. The increasing amount...
In knowledge bases, the open world assumption and the ability to express variables may lead to an answer redundancy problem. This problem occurs when the returned answers...
Latent Semantic Analysis (LSA) is used in many research fields, with several applications in classification. We propose to improve LSA with additional semantic information found with s...
When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...