This paper describes how use the HTMLEditorKit to perform web data mining on stock statistics for listed firms. Our focus is on making use of the web to get information about comp...
Abstract. Hierarchical classification problems gained increasing attention within the machine learning community, and several methods for hierarchically structured taxonomies have...
This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
Static index pruning techniques aim at removing from the posting lists of an inverted file the references to documents which are likely to be not relevant for answering user querie...
We describe a method for improving the precision of metasearch results based upon scoring the visual features of documents' surrogate representations. These surrogate scores ...
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, ...