Unsupervised learning can be used to extract image representations that are useful for various and diverse vision tasks. After noticing that most biological vision systems for interpreting static images are trained using disparity information, we developed an analogous framework for unsupervised learning. The output of our method is a model that can generate a vector representation or descriptor from any static image. However, the model is trained using pairs of consecutive video frames, which are used to find representations that are consistent with optical flow-derived objects, or ‘flobjects’. To demonstrate the flobject analysis framework, we extend the latent Dirichlet allocation bagof-words model to account for real-valued word-specific flow vectors and image-specific probabilistic associations between flow clusters and topics. We show that the static image representations extracted using our method can be used to achieve higher classification rates and better genera...