Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
A new technique to locate content-representing words for a given document image using a representation of character shapes is described. A character shape code representation define...
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over many s...
Existing research on news video analysis mainly concentrates on structure analysis, semantic concept detection, annotation and search. However, little work has been devoted to...
clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a ...
David M. Blei, Thomas L. Griffiths, Michael I. Jor...