Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
In a wide range of business areas dealing with text data streams, including CRM, knowledge management, and Web monitoring services, it is an important issue to discover topic tren...
Tagged data is rapidly becoming more available on the World Wide Web. Web sites which populate tagging services offer a good way for Internet users to share their knowledge. An in...
Dimensionality reduction is a statistical tool commonly used to map high-dimensional data into a lower dimensionality. The transformed data is typically more suitable for regressi...
Bill Kapralos, Nathan Mekuz, Agnieszka Kopinska, S...