We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
High-dimensional collections of 0-1 data occur in many applications. The attributes in such data sets are typically considered to be unordered. However, in many cases there is a n...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
A key challenge facing IT organizations today is their evolution towards adopting e-business practices that gives rise to the need for reengineering their underlying software syst...
Mohammad El-Ramly, Eleni Stroulia, Paul G. Sorenso...
Open source projects are gradually incorporating usability methods into their development practices, but there are still many unmet needs. One particular need for nearly any open ...
Michael Terry, Matthew Kay, Brad Van Vugt, Brandon...