We develop the distance dependent Chinese restaurant process (CRP), a flexible class of distributions over partitions that allows for nonexchangeability. This class can be used to...
The growth of the web has directly influenced the increase in the availability of relational data. One of the key problems in mining such data is computing the similarity between o...
Pradeep Muthukrishnan, Dragomir R. Radev, Qiaozhu ...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
With product reviews growing in depth and becoming more numerous, it is a growing challenge to acquire a comprehensive understanding of their contents, for both customers and produc...
The problem of simultaneously clustering columns and rows (coclustering) arises in important applications, such as text data mining, microarray analysis, and recommendation system...