We analyze the structure and evolution of discussion cascades on four popular websites: Slashdot, Barrapunto, Meneame and Wikipedia. Despite the large heterogeneity among these sites, a simple preferential attachment (PA) model with a bias to the root captures the temporal evolution of the observed trees and many of their statistical properties, namely the probability distributions of branching factors (degrees) and subtree sizes, and certain correlations. The parameters of the model are learned efficiently using a novel maximum likelihood estimation scheme for PA, and they provide a figurative interpretation of the communication habits and the resulting discussion cascades on the four websites.

Categories and Subject Descriptors: J.4 [Computer Applications]: Social and Behavioral Sciences--Sociology; G.2.2 [Mathematics of Computing]: Graph Theory--Network problems, Trees

General Terms: measurement, algorithms, human factors

Keywords: discussion cascades, preferential attachment, m...
Vicenç Gómez, Hilbert J. Kappen, And
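
To make the idea of preferential attachment with a bias to the root concrete, the following is a minimal sketch in Python. It is not the model specification from this paper: the additive weight form, the function name `grow_tree`, and the parameters `alpha` and `root_bias` are assumptions chosen only for illustration. Each new comment picks a parent with probability proportional to the parent's current number of replies plus a constant, and the root post receives an extra constant weight.

```python
import random


def grow_tree(n_comments, root_bias=2.0, alpha=1.0, seed=None):
    """Grow a discussion tree by preferential attachment with a root bias.

    Illustrative assumption, not the paper's exact formulation:
    node k is chosen with weight replies[k] + alpha, and the root
    (node 0) gets an additional weight of root_bias.

    Returns a list `parents` where parents[i] is the parent of node i
    and parents[0] is None (the root post).
    """
    rng = random.Random(seed)
    parents = [None]   # node 0 is the root post
    replies = [0]      # number of direct replies received by each node

    for new_node in range(1, n_comments + 1):
        # Attractiveness of every existing node at this time step.
        weights = [replies[k] + alpha for k in range(new_node)]
        weights[0] += root_bias  # hypothetical additive bias to the root

        # Sample one parent with probability proportional to its weight.
        target = rng.choices(range(new_node), weights=weights, k=1)[0]

        parents.append(target)
        replies.append(0)
        replies[target] += 1

    return parents


if __name__ == "__main__":
    tree = grow_tree(20, root_bias=5.0, seed=0)
    print("direct replies to the root:", tree.count(0))
```

Under these assumptions, a larger `root_bias` yields wider, shallower trees (most comments reply directly to the post), while a smaller value lets popular comments attract long reply chains.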