Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Human-quality text summarization systems are di cult to design, and even more di cult to evaluate, in part because documents can di er along several dimensions, such as length, wri...
Jade Goldstein, Mark Kantrowitz, Vibhu O. Mittal, ...
Many web documents refer to specific geographic localities and many people include geographic context in queries to web search engines. Standard web search engines treat the geogra...
Subodh Vaid, Christopher B. Jones, Hideo Joho, Mar...
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
Using language technology for text analysis and light-weight ontologies as a content-mediating level, we acquire indexing patterns from vast amounts of indexing data for Englishla...