Abstract. It is argued that the ability to generalise is the most important characteristic of learning, and that generalisation may be achieved only if pattern recognition systems learn the rules of meta-knowledge rather than the labels of objects. A structure called the “tower of knowledge” is proposed for organising knowledge. A scheme for interpreting scenes using the tower of knowledge and aspects of utility theory is also proposed. Finally, it is argued that globally consistent labelling solutions are neither possible nor desirable for an artificial cognitive system.