A central problem in the analysis of motion capture (MoCap)
data is how to decompose motion sequences into primitives.
Ideally, a description in terms of primitives should
facilitate the recognition, synthesis, and characterization of
actions. We propose an unsupervised learning algorithm
for automatically decomposing joint movements in human
MoCap sequences into shift-invariant basis
functions. Our formulation models the time-series data
of joint movements in actions as a sparse linear combination
of short basis functions (snippets), which are executed
(or “activated”) at different positions in time. Given a set of
MoCap sequences of different actions, our algorithm finds
the decomposition of MoCap sequences in terms of basis
functions and their activations in time. Using the tools of
L1 minimization, the procedure alternately solves two large
convex minimizations: Given the basis functions, a variant
of Orthogonal Matching Pursuit solves for the...
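
To make the shift-invariant model described above concrete, the following is a minimal sketch of the synthesis direction only: a one-dimensional signal is built as a sum of short basis functions placed at a few positions in time. It is an illustration under stated assumptions, not the authors' implementation. The function name `synthesize`, the snippet values, and the activation positions are all hypothetical, and the learning step (solving for activations and basis functions) is not shown.

```python
import numpy as np

def synthesize(bases, activations, length):
    """Sum each basis function convolved with its sparse activation train.

    Hypothetical helper for illustration; not taken from the paper.
    """
    signal = np.zeros(length)
    for basis, act in zip(bases, activations):
        # Convolution places a scaled copy of `basis` at every nonzero
        # entry of `act`; the tail beyond `length` is truncated.
        signal += np.convolve(act, basis)[:length]
    return signal

# Two made-up snippets, each 3 samples long.
bases = [np.array([1.0, 2.0, 1.0]), np.array([1.0, -1.0, 0.0])]

length = 10
activations = [np.zeros(length), np.zeros(length)]
activations[0][2] = 1.0   # snippet 0 activated at t = 2
activations[0][6] = 0.5   # snippet 0 activated at t = 6, half amplitude
activations[1][4] = 2.0   # snippet 1 activated at t = 4, double amplitude

signal = synthesize(bases, activations, length)
print(signal)
```

Only three of the twenty activation coefficients are nonzero, which is the sense in which the combination is sparse; the same snippet reused at t = 2 and t = 6 is what shift invariance refers to.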