The Web is now a huge information repository with a rich semantic structure that, however, is primarily addressed to human understanding rather than automated processing by a compu...
Linear Discriminant Analysis (LDA) has been a popular method for feature extracting and face recognition. As a supervised method, it requires manually labeled samples for training...
This paper explores the large-scale acquisition of sense-tagged examples for Word Sense Disambiguation (WSD). We have applied the "WordNet monosemous relatives" method t...
Vast amounts of text on the Web are unstructured and ungrammatical, such as classified ads, auction listings, forum postings, etc. We call such text “posts.” Despite their in...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...