Collecting behavior protocols is a common practice in human factors research, but analyzing the resulting large data sets has long been tedious and time-consuming. We are interested in automatically finding canonical behaviors: a small subset of behavior protocols that is most representative of the full data set, providing a view of the data with as few protocols as possible. Behavior protocols often have a natural graph-based representation, yet there has been little work applying graph theory to their study. In this paper we extend our recent algorithm by taking into account the graph topology induced by the paths taken through the space of possible behaviors. We applied this technique to find canonical web-browsing behaviors for computer users. By comparing the identified canonical sets to a ground truth determined by expert human coders, we found that the resulting graph-based metric outperforms our previous metric based on edit distance.
Walter C. Mankowski, Peter Bogunovich, Ali Shokouf