We present a novel approach to non-rigid structure from motion (NRSFM) from an orthographic video sequence, based on a new interpretation of the problem. Existing approaches assume that the object shape space is well modeled by a linear subspace. Our approach assumes only that small neighborhoods of shapes are well modeled by a linear subspace. This constrains the shapes to lie on a manifold whose dimensionality equals the number of degrees of freedom of the object. After showing that the problem is still overconstrained, we present a solution composed of a novel initialization algorithm followed by a robust extension of the Locally Smooth Manifold Learning algorithm tailored to the NRSFM problem. Finally, we present test cases in which the linear basis method fails (and is in fact not designed to work) while the proposed approach succeeds.