To be autonomous, intelligent robots must learn the foundations of commonsense knowledge from their own sensorimotor experience in the world. We describe four recent research results that show how a robot learning agent can bootstrap from the "blooming, buzzing confusion" of the pixel level to a higher-level ontology that includes distinctive states, places, actions, and objects.