We propose the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data. Each group of data is modeled with a...
Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, ...
Information resources on the Web like videos, images, and documents are increasingly becoming more “social” through user engagement via commenting systems. These commenting sy...
In this paper, we present a method for structuring a document according to the information present in its Table of Contents. The detection of the ToC as well as the determination ...
Computational resources for research in legal environments have historically implied remote access to large databases of legal documents such as case law, statutes, law reviews an...
Jack G. Conrad, Khalid Al-Kofahi, Ying Zhao, Georg...
– This paper describes a text categorization approach that is based on a combination of a newly designed text representation with a kNN classifier. The new text document represen...