High dimensional data sets are encountered in many modern database applications. The usual approach is to construct a summary of the data set through a lossy compression technique...
Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this pa...
Yunhua Hu, Ya-nan Qian, Hang Li, Daxin Jiang, Jian...
Abstract--A method for explaining results of a regressionbased classifier is proposed. The data is clustered using a metric extracted from the classifier. This way, clusters found ...
We present a large-scale analysis of the content of weblogs dating back to the release of the Blogger program in 1999. Over one million blogs were analyzed from their conception t...
Abstract. The problem of clustering data can be formulated as a graph partitioning problem. In this setting, spectral methods for obtaining optimal solutions have received a lot of...
Marcus Weber, Wasinee Rungsarityotin, Alexander Sc...