Verbal and non-verbal interaction capabilities for robots are often studied in isolation from each other in current research because they largely contribute to different aspects of interaction. For a robot companion that needs to be both useful and social, however, these capabilities have to be considered in a unified, complex interaction context. In this paper we present two case studies in such a context that reveal the strengths and limitations of these modalities and demonstrate their complementary benefits for human-robot interaction. Motivated by this evidence, we propose an interaction framework that addresses the common features of interactional and propositional information rather than their differences, which are the focus of many other works in this field, and models them using a single principle: grounding.