Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical ...
This paper proposes a non-interactive system for reducing the level of OCR-induced typographical variation in large text collections, contemporary and historical. Text-Induced Corp...
We demonstrate the H3Viewer graph drawing library, which can be run from a standalone program or in conjunction with other programs such as SGI's Site Manager application. Our...
Taxonomies of the Web typically have hundreds of thousands of categories and a skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
We have created the first image search engine based entirely on faces. Using simple text queries such as "smiling men with blond hair and mustaches," users can search thr...