The highly variable and dynamic word usage in social media presents serious challenges for both research and those commercial applications that are geared towards blogs or other u...
One of the most challenging problems in data manipulation in the future is to be able to efficiently handle very large databases but also multiple induced properties or generalizatio...
Microblogging today has become a very popular communication tool among Internet users. Millions of users share opinions on different aspects of life every day. Therefore microblogg...
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
The increasing availability of large-scale location traces creates unprecedented opportunities to change the paradigm for knowledge discovery in transportation systems. A particular...
Yong Ge, Hui Xiong, Alexander Tuzhilin, Keli Xiao,...