A Language of Life: Characterizing People Using Cell Phone Tracks

Mobile devices can produce continuous streams of data that are often specific to the person carrying them. We show that cell phone tracks from the MIT Reality dataset can be used to reliably characterize individual people. We treat each person’s data as a separate language and build a standard n-gram language model for each “author.” We then compute the perplexity of an unlabelled sample under each person’s language model and assign the sample to the user whose model yields the lowest perplexity. This technique achieves 85% accuracy and can also be used for clustering. We also show how language models can be used for predicting movement and propose metrics to measure the accuracy of the predictions. Finally, we develop an alternative method for identifying individuals by counting the subsequences in a sample which are unique to their authors. This is done by building a generalized suffix tree of the training set and counting each subsequence f...
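The perplexity-based assignment described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a bigram model (n = 2), add-one smoothing, and made-up location symbols and person names standing in for cell phone tracks. The function names `train_bigram_model`, `perplexity`, and `classify` are hypothetical.

```python
import math
from collections import Counter

def train_bigram_model(seq):
    """Count bigram contexts and bigrams in one person's symbol sequence."""
    contexts = Counter(seq[:-1])          # every symbol that has a successor
    bigrams = Counter(zip(seq, seq[1:]))  # adjacent symbol pairs
    return contexts, bigrams

def perplexity(model, sample, vocab_size, alpha=1.0):
    """Perplexity of `sample` under a bigram model with add-alpha smoothing.

    The smoothing scheme is an assumption of this sketch, not the paper's.
    """
    contexts, bigrams = model
    pairs = list(zip(sample, sample[1:]))
    if not pairs:
        raise ValueError("sample must contain at least two symbols")
    log_prob = 0.0
    for prev, cur in pairs:
        p = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(pairs))

def classify(models, sample, vocab_size):
    """Assign the sample to the person whose model gives the lowest perplexity."""
    return min(models, key=lambda name: perplexity(models[name], sample, vocab_size))

# Hypothetical training tracks: sequences of location symbols per person.
tracks = {
    "alice": ["home", "road", "office", "office", "road", "home"] * 5,
    "bob": ["dorm", "gym", "lab", "lab", "gym", "dorm"] * 5,
}
vocab = {symbol for track in tracks.values() for symbol in track}
models = {name: train_bigram_model(track) for name, track in tracks.items()}

# An unlabelled sample to attribute to one of the known people.
sample = ["home", "road", "office", "road", "home"]
predicted = classify(models, sample, len(vocab))
```

Here every transition in `sample` has been seen in the first track and none in the second, so the first model assigns higher probabilities and therefore a lower perplexity. With real data, the choice of n, the smoothing method, and how raw tracks are mapped to symbols would all matter and are left open here.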

Alexy Khrabrov, George Cybenko

Computer Science And Engineering | CSE 2009 | Language Models | N-gram Language Model | Person’s Language Model

Added	20 May 2010
Updated	20 May 2010
Type	Conference
Year	2009
Where	CSE
Authors	Alexy Khrabrov, George Cybenko
