To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world...
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, ...
Most of recommender systems try to find items that are most relevant to the older choices of a given user. Here we focus on the "surprise me" query: A user may be bored ...
Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long ...
Labeling text data is quite time-consuming but essential for automatic text classification. Especially, manually creating multiple labels for each document may become impractical ...
Email is one of the most prevalent communication tools today, and solving the email overload problem is pressingly urgent. A good way to alleviate email overload is to automatical...
A major source of information (often the most crucial and informative part) in scholarly articles from scientific journals, proceedings and books are the figures that directly pro...
Amr Ahmed, Eric P. Xing, William W. Cohen, Robert ...
The problem of ethnicity identification from names has a variety of important applications, including biomedical research, demographic studies, and marketing. Here we report on th...
Anurag Ambekar, Charles B. Ward, Jahangir Mohammed...
Search queries are typically very short, which means they are often underspecified or have senses that the user did not think of. A broad latent query aspect is a set of keywords ...
Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such a...
In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and Word Wide Web. Most ...