While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Metasearch engine, comparison-shopping, and Deep Web crawling applications need to extract search result records embedded in result pages returned by search engines in response ...
We present results from an experiment that studied the information search behavior of younger and older adults in a medical decision-making task. To study how different combinatio...
The popularity of social bookmarking sites has made them prime targets for spammers. Many of these systems require an administrator’s time and energy to manually filter or remo...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...