The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
In spite of the initialization problem, the ExpectationMaximization (EM) algorithm is widely used for estimating the parameters in several data mining related tasks. Most popular ...
Chandan K. Reddy, Hsiao-Dong Chiang, Bala Rajaratn...
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...
The rise of social interactions on the Web requires developing new methods of information organization and discovery. To that end, we propose a generative community-based probabil...
The monumental cost of health care, especially for chronic disease treatment, is quickly becoming unmanageable. This crisis has motivated the drive towards preventative medicine, ...
Darcy A. Davis, Nitesh V. Chawla, Nicholas Blumm, ...