We present an approach to document clustering based on winnowing fingerprints that achieved good effectiveness with considerable savings in memory space and computation tim...
Topic representation mismatch is a key problem in topic-oriented summarization, because the specified topic is usually too short to interpret. This paper proposes a novel ad...
We have developed a computational framework to characterize social network dynamics in the blogosphere at individual, group and community levels. Such characterization could be us...
Munmun De Choudhury, Hari Sundaram, Ajita John, Do...
In automated text categorization, given a small number of labeled documents, it is very challenging, if not impossible, to build a reliable classifier that is able to achieve high...
Zenglin Xu, Rong Jin, Kaizhu Huang, Michael R. Lyu...
Mining different types of communities from web data has attracted considerable research effort in recent years. However, none of the existing community mining techniques has taken i...
Qiankun Zhao, Sourav S. Bhowmick, Xin Zheng, Kai Y...
We introduce a multi-stage ensemble framework, Error-Driven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a ...
With the development of highly efficient graph data collection technology in many application fields, classification of graph data emerges as an important topic in the data mining...
Social Network Marketing techniques employ pre-existing social networks to increase brand or product awareness through word-of-mouth promotion. Full understanding of social netw...
Given a collection of complex, time-stamped events, how do we find patterns and anomalies? Events could be meetings with one or more persons with one or more agenda items at zero ...
Hanghang Tong, Yasushi Sakurai, Tina Eliassi-Rad, ...