Aspect extraction is a central problem in sentiment analysis. Current methods either extract aspects without categorizing them, or extract and categorize them using unsupervised t...
We present a new edition of the Google Books Ngram Corpus, which describes how often words and phrases were used over a period of five centuries, in eight languages; it reflects...
Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper d...
John D. Burger, John C. Henderson, George Kim, Gui...
Many statistical M-estimators are based on convex optimization problems formed by the weighted sum of a loss function with a norm-based regularizer. We analyze the convergence rat...
Alekh Agarwal, Sahand Negahban, Martin J. Wainwrig...
Abstract—Correcting software defects accounts for a significant amount of resources such as time, money and personnel. To be able to focus testing efforts where needed the most,...
The voice activity detectors (VADs) based on statistical models have shown impressive performances especially when fairly precise statistical models are employed. Moreover, the ac...
Abstract. Activity inference based on object use has received considerable recent attention. Such inference requires statistical models that map activities to the objects used in p...
We present statistical models for morphological disambiguation in agglutinative languages, with a specific application to Turkish. Turkish presents an interesting problem for stati...
In this paper we investigate how to automatically determine if two document collections are written from different perspectives. By perspectives we mean a point of view, for examp...
Statistical models on full and partial rankings of n items are often of limited practical use for large n due to computational consideration. We explore the use of non-parametric ...