This paper describes an implemented robotic agent architecture in which the environment, as sensed by the agent, guides the recognition of spoken and gestural directives given by a human user. The agent recognizes these directives using a probabilistic language model that conditions probability estimates for candidate directives on visually, proprioceptively, or otherwise sensed properties of entities in its environment, and updates these estimates as those properties change. The result is an agent that can reject mis-recognized directives that do not ‘make sense’ given its representation of the current state of the world.
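To make the conditioning concrete, the following is a minimal illustrative sketch, not the paper's actual implementation: recognizer hypotheses carrying acoustic probabilities are re-scored by a world-compatibility term and renormalized. The `rescore` function, the directive triples, and the `world` dictionary are all hypothetical names introduced here for illustration.

```python
# Hypothetical sketch of re-scoring recognizer hypotheses against a
# sensed world model. Names and data structures are illustrative only.

def rescore(hypotheses, world, epsilon=1e-3):
    """Combine acoustic scores with world-state compatibility, then renormalize.

    hypotheses: list of (directive, acoustic_prob) pairs, where a directive
        is a (verb, object_name, required_property) triple.
    world: dict mapping object_name -> set of currently sensed properties.
    """
    scores = []
    for directive, p_acoustic in hypotheses:
        verb, obj, prop = directive
        # A directive 'makes sense' if its target is sensed in the world
        # and has the property the directive presupposes.
        sensible = obj in world and prop in world[obj]
        p_world = 1.0 if sensible else epsilon
        scores.append((directive, p_acoustic * p_world))
    total = sum(s for _, s in scores)
    return [(d, s / total) for d, s in scores]


# Example: two acoustically confusable hypotheses; the world model
# shifts probability mass toward the one that makes sense.
world = {"cup": {"graspable", "red"}, "table": {"flat"}}
hypotheses = [
    (("grasp", "cup", "graspable"), 0.45),  # sensible in this world
    (("grasp", "cap", "graspable"), 0.55),  # mis-recognition: no 'cap' sensed
]
for directive, p in rescore(hypotheses, world):
    print(directive, round(p, 3))

# When sensed properties change (e.g. the cup becomes occluded), calling
# rescore with the updated world model revises the estimates accordingly.
```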