The goal of semi-supervised learning (SSL) methods is to reduce the amount of labeled training data required by learning from both labeled and unlabeled instances. Macskassy and Provost [1] proposed the weighted-vote relational neighbor classifier (wvRN) as a simple yet effective baseline for semi-supervised learning on network data. It is similar to many recent graph-based SSL methods (e.g., [2], [3]), has been shown to be essentially the same as the Gaussian-field classifier proposed by Zhu et al. [4], and proves very effective on some benchmark network datasets. We describe another simple and intuitive semi-supervised learning method, based on random graph walks, that outperforms wvRN by a large margin on several benchmark datasets when very few labels are available. Additionally, we show that using authoritative instances as training seeds (instances that arguably cost much less to label) dramatically reduces the amount of labeled data required to achieve the same classification...
Frank Lin, William W. Cohen