This paper addresses several key issues in the ArnetMiner system, which aims at extracting and mining academic social networks. Specifically, the system focuses on: 1) Extracting ...
Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zha...
Integrating information in multiple natural languages is a challenging task that often requires manually created linguistic resources such as a bilingual dictionary or examples of...
Merchants selling products on the Web often ask their customers to share their opinions and hands-on experiences on products they have purchased. Unfortunately, reading through al...
Data mining algorithms use various Trie and bitmap-based representations to optimize the support (i.e., frequency) counting performance. In this paper, we compare the memory requi...
Abstract--Randomization is a general technique for evaluating the significance of data analysis results. In randomizationbased significance testing, a result is considered to be in...