One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigg...
Traditional ranking mainly focuses on one type of data source, and effective modeling still relies on a sufficiently large number of labeled or supervised examples. However, in m...
Bo Wang, Jie Tang, Wei Fan, Songcan Chen, Zi Yang,...
As massive repositories of real-time human commentary, social media platforms have arguably evolved far beyond passive facilitation of online social interactions. Rapid analysis o...
We discover communities from social network data, and analyze the community evolution. These communities are inherent characteristics of human interaction in online social network...
Yu-Ru Lin, Yun Chi, Shenghuo Zhu, Hari Sundaram, B...