WINACS (Web-based Information Network Analysis for Computer Science) is a project that incorporates many recent, exciting developments in data sciences to construct a Web-based co...
Background: Data generated using `omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of...
Yu Guo, Armin Graber, Robert N. McBurney, Raji Bal...
The vast majority of the published skew estimation methods for scanned document images are for textual documents. These methods are based on the principle that the skew angles can...
Many important application areas of text classifiers demand high precision and it is common to compare prospective solutions to the performance of Naive Bayes. This baseline is us...
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...