This paper describes a novel approach to named entity (NE) tagging on degraded documents. NE tagging is the process of identifying salient text strings in unstructured text, corre...
Automated photo tagging is essential to make massive unlabeled photos searchable by text search engines. Conventional image annotation approaches, though working reasonably well o...
Lei Wu, Steven C. H. Hoi, Rong Jin, Jianke Zhu, Ne...
In this paper we describe a multi-strategy approach to improving semantic extraction from news video. Experiments show the value of careful parameter tuning, exploiting multiple fe...
Alexander G. Hauptmann, Ming-yu Chen, Michael G. C...
Web Page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need take emails as input. Email data is usually noisy and thus i...